(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 
International Bureau 

(43) International Publication Date 
12 December 2002 (12.12.2002) 




PCT 






(10) International Publication Number 

WO 02/098358 A2 



(51) International Patent Classification 7: A61K



(21) International Application Number: PCT/US02/17594



(22) International Filing Date: 4 June 2002 (04.06.2002) 



(25) Filing Language: English

(26) Publication Language: English



(30) Priority Data:
60/295,917    4 June 2001 (04.06.2001)         US
60/350,666    13 November 2001 (13.11.2001)    US
60/368,689    29 March 2002 (29.03.2002)       US
60/372,246    12 April 2002 (12.04.2002)       US
10/160,233    31 May 2002 (31.05.2002)         US



(71) Applicant: EOS BIOTECHNOLOGY, INC. [US/US]; 
225 A Gateway Boulevard, South San Francisco, CA 94080 
(US). 

(72) Inventors: AFAR, Daniel, E., H.; 435 Visitacion Avenue,
Brisbane, CA 94005 (US). AGUS, David; 522 North Crescent
Drive, Beverly Hills, CA 90210 (US). MACK, David,
H.; 2076 Monterey Avenue, Menlo Park, CA 94025 (US).



(74) Agents: BASTIAN, Kevin, L. et al.; Townsend and
Townsend and Crew LLP, Two Embarcadero Center,
Eighth Floor, San Francisco, CA 94111 (US).

(81) Designated States (national): AE, AG, AL, AM, AT, AU, 
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH,
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, 
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, 
MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG, 
SI, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, UZ, VN, 
YU, ZA, ZM, ZW. 

(84) Designated States (regional): ARIPO patent (GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW), 
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR,
GB, GR, IE, IT, LU, MC, NL, PT, SE, TR), OAPI patent 
(BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, 
NE, SN, TD, TG). 

Published: 

— without international search report and to be republished 
upon receipt of that report 

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.



(54) Title: METHODS OF DIAGNOSIS AND TREATMENT OF ANDROGEN-DEPENDENT PROSTATE CANCER,
PROSTATE CANCER UNDERGOING ANDROGEN-WITHDRAWAL, AND ANDROGEN-INDEPENDENT PROSTATE
CANCER

(57) Abstract: Described herein are genes whose expression is up-regulated or down-regulated in prostate cancer. Also described
are such genes whose expression is further up-regulated or down-regulated in drug-resistant prostate cancer cells. Related methods
and compositions that can be used for diagnosis and treatment of prostate cancer are disclosed. Also described herein are methods
that can be used to identify modulators of prostate cancer.






WO 02/098358 PCT/US02/17594



METHODS OF DIAGNOSIS AND TREATMENT OF ANDROGEN-DEPENDENT 
PROSTATE CANCER, PROSTATE CANCER UNDERGOING ANDROGEN 
WITHDRAWAL, AND ANDROGEN-INDEPENDENT PROSTATE CANCER 


CROSS-REFERENCES TO RELATED APPLICATIONS 
This application claims priority from the following applications: USSN 60/295,917,
filed June 4, 2001; USSN 60/368,689, filed March 29, 2002; USSN 60/350,666, filed
November 13, 2001; and USSN 60/372,246, filed April 12, 2002; each of which is
incorporated herein by reference in its entirety.

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression 
profiles and nucleic acids, products, and antibodies thereto that are involved in prostate 
cancer; and to the use of such expression profiles and compositions in the diagnosis,
prognosis, and therapy of prostate cancer. The invention further relates to methods for 
identifying and using agents and/or targets that inhibit prostate cancer. 

BACKGROUND OF THE INVENTION 

Prostate cancer is the most frequently diagnosed cancer and the second leading cause
of male cancer death in North America and northern Europe. Early detection of prostate
cancer using a serum test for prostate-specific antigen (PSA) has dramatically improved the
treatment of the disease (Oesterling (1992) J. Am. Med. Assoc. 267:2236-2238). Treatment
of prostate cancer consists largely of surgical prostatectomy, radiation therapy, androgen
ablation therapy and chemotherapy. Although many prostate cancer patients are effectively
treated, the current therapies can all induce serious side effects which diminish quality of life.
Patients who present with metastatic disease are most often treated with androgen-ablation
therapy. Hormone blockade results in significant regression of the tumor. However, this
treatment rarely cures the patient and invariably results in progression to
androgen-independent disease, which is incurable. Afrin and Stuart (1994) J.S.C. Med. Assoc.
90:231-236.

The identification of novel therapeutic targets and diagnostic markers is essential for
improving the current treatment of prostate cancer patients. Recent advances in molecular
medicine have increased the interest in tumor-specific cell surface antigens that could serve
as targets for various immunotherapeutic or small molecule strategies. Antigens suitable for
immunotherapeutic strategies should be highly expressed in cancer tissues and ideally not
expressed in normal adult tissues. Expression in tissues that are dispensable for life,
however, may be tolerated. Examples of such antigens include Her2/neu and the B-cell
antigen CD20. Humanized monoclonal antibodies directed to Her2/neu (Herceptin) are
currently in use for the treatment of metastatic breast cancer. Ross and Fletcher (1998) Stem
Cells 16:413-428. Similarly, anti-CD20 monoclonal antibodies (Rituxan) are used to
effectively treat non-Hodgkin's lymphoma. Maloney, et al. (1997) Blood 90:2188-2195;
Leget and Czuczman (1998) Curr. Opin. Oncol. 10:548-551.

Several potential immunotherapeutic targets have been identified for prostate cancer.
They include prostate-specific membrane antigen (PSMA) (Israeli, et al. (1993) Cancer Res.
53:227-230), prostate stem cell antigen (PSCA; Reiter, et al. (1998) Proc. Natl. Acad. Sci.
USA 95:1735-1740), and serpentine transmembrane epithelial antigen of the prostate
(STEAP; Hubert, et al. (1999) Proc. Natl. Acad. Sci. USA 96:14529-14534). PSMA is a type
II transmembrane hydrolase with significant homology to a rat neuropeptidase (Carter, et al.
(1996) Proc. Natl. Acad. Sci. USA 93:749-753). Antibodies directed towards PSMA are
currently being used to detect metastasized prostate cancer as the Prostascint Scan (Sodee, et
al. (1996) Clin. Nucl. Med. 21:759-767) and are also being evaluated for treatment of
advanced disease (Gregorakis, et al. (1998) Semin. Urol. Oncol. 16:2-12; Liu, et al. (1998)
Cancer Res. 58:4055-4060; Murphy, et al. (1998) J. Urol. 160:2396-2401). In a study on
bone metastasis of prostate cancer, only 8 out of 18 patient samples expressed PSMA (Silver,
et al. (1997) Clin. Cancer Res. 3:81-85). Therefore, it is clear that other targets need to be
identified to manage metastasized disease. PSCA is a member of the Thy-1/Ly-6 family of
glycosylphosphatidylinositol-linked plasma membrane proteins (Reiter, et al. (1998) Proc.
Natl. Acad. Sci. USA 95:1735-1740). Immunohistochemical data show that PSCA is
up-regulated in the majority of prostate cancer epithelia and is also detected in bone metastasis
(Gu, et al. (2000) Oncogene 19:1288-1296). Recent work shows that antibodies directed to
PSCA can prevent metastatic spread of prostate cancer in a mouse model (Saffran, et al.
(2001) Proc. Natl. Acad. Sci. USA 98:2658-2663). STEAP is a multi-transmembrane
prostate-specific protein that may function as a channel or transporter protein (Hubert, et al.
(1999) Proc. Natl. Acad. Sci. USA 96:14529-14534). Its protein expression is specific to the
basolateral membranes of normal prostate and prostate cancer epithelia. STEAP expression
was most highly concentrated at cell-cell boundaries, implying a potential function in
intercellular communication. Therapeutic monoclonal antibodies have so far not been
reported for STEAP.

SUMMARY OF THE INVENTION

The present invention therefore provides nucleotide sequences of genes that are up- 
and down-regulated in androgen-independent prostate cancer cells or prostate cells 
undergoing androgen withdrawal. Such genes are useful for diagnostic purposes, and also as 
targets for screening for therapeutic compounds that modulate prostate cancer, such as 
hormones or antibodies. Other aspects of the invention will become apparent to the skilled
artisan by the following description of the invention. 

In one aspect, the present invention provides a method of detecting an
androgen-independent prostate cancer-associated transcript in a cell from a patient, the method
comprising contacting a biological sample from the patient with a polynucleotide that
selectively hybridizes to a nucleic acid molecule comprising a sequence at least 80% identical
to a sequence as shown in Tables 1A-4.

In one embodiment, the present invention provides a method of determining the level 
of a prostate cancer-associated transcript in a cell from a patient.

In one embodiment, the present invention provides a method of detecting a prostate 
cancer-associated transcript in a cell from a patient, the method comprising contacting a
biological sample from the patient with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1A-4.

In various embodiments, the polynucleotide selectively hybridizes to a sequence at
least 95% identical to a sequence as shown in Tables 1A-4; the polynucleotide comprises a
sequence as shown in Tables 1A-4; the biological sample is a tissue sample; the biological
sample comprises isolated nucleic acids, e.g., mRNA; the polynucleotide is labeled, e.g., with
a fluorescent label; the polynucleotide is immobilized on a solid surface; the patient is
undergoing a therapeutic regimen to treat prostate cancer; the patient is suspected of having
metastatic prostate cancer; the patient is a human; the patient is suspected of having a
taxol-resistant cancer; or the prostate cancer-associated transcript is mRNA.

In other embodiments, the method further comprises the step of amplifying nucleic 
acids before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the efficacy 
of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i)
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii)
determining the level of a prostate cancer-associated transcript in the biological sample by
contacting the biological sample with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1A-4, thereby monitoring
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate 
cancer. In a further embodiment, the patient has a drug resistant (e.g., taxol resistant) form of 
prostate cancer. 

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the prostate cancer-associated transcript to a level of the prostate cancer-associated 
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic 
treatment. 

Additionally, provided herein is a method of evaluating the effect of a candidate 
prostate cancer drug comprising administering the drug to a patient and removing a cell
sample from the patient. The expression profile of the cell is then determined. This method
may further comprise comparing the expression profile to an expression profile of a healthy
individual. In a preferred embodiment, said expression profile includes a gene of Tables 1A-4.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1A-4.

In one embodiment, an expression vector or cell comprises the isolated nucleic acid.

In one aspect, the present invention provides an isolated polypeptide which is encoded
by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1A-4.

In another aspect, the present invention provides an antibody that specifically binds to
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide
sequence as shown in Tables 1A-4.

In certain embodiments, the antibody is conjugated to an effector component, e.g., a
fluorescent label, a radioisotope or a cytotoxic chemical; the antibody is an antibody
fragment; or the antibody is humanized.

In one aspect, the present invention provides a method of detecting a prostate cancer 
cell in a biological sample from a patient, the method comprising contacting the biological
sample with an antibody as described herein. 

In another aspect, the present invention provides a method of detecting antibodies 
specific to prostate cancer in a patient, the method comprising contacting a biological sample 
from the patient with a polypeptide encoded by a nucleic acid comprising a sequence from 
Tables 1A-4.

In another aspect, the present invention provides a method for identifying a compound 
that modulates a prostate cancer-associated polypeptide, the method comprising the steps of: 
a) contacting the compound with a prostate cancer-associated polypeptide, the polypeptide 
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical 
to a sequence as shown in Tables 1A-4; and b) determining the functional effect of the
compound upon the polypeptide. 

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a 
chemical effect. 

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or cell 
membrane. In another embodiment, the polypeptide is recombinant.

In one embodiment, the functional effect is determined by measuring ligand binding
to the polypeptide. 

In another aspect, the present invention provides a method of inhibiting proliferation 
of a prostate cancer-associated cell to treat prostate cancer in a patient, the method 
comprising the step of administering to the subject a therapeutically effective amount of a
compound identified as described herein. 

In one embodiment, the compound is an antibody. 

In another aspect, the present invention provides a drug screening assay comprising 
the steps of: a) administering a test compound to a mammal having prostate cancer or to a 
cell sample isolated therefrom; b) comparing the level of gene expression of a polynucleotide
that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in
Tables 1A-4 in a treated cell or mammal with the level of gene expression of the
polynucleotide in a control cell sample or mammal, wherein a test compound that modulates
the level of expression of the polynucleotide is a candidate for the treatment of prostate
cancer.

In one embodiment, the control is a mammal with prostate cancer or a cell sample 
therefrom that has not been treated with the test compound. In another embodiment, the
control is a normal cell or mammal. 

In one embodiment, the test compound is administered in varying amounts or 
concentrations. In another embodiment, the test compound is administered for varying time
periods. In another embodiment, the comparison can occur after addition or removal of the
drug candidate.

In one embodiment, the levels of a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1A-4 are
individually compared to their respective levels in a control cell sample or mammal. In a
preferred embodiment, the plurality of polynucleotides is from three to ten.
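
By way of illustration only, the individual comparison of a plurality of transcript levels to their respective control levels might be sketched as follows. The sketch assumes that expression levels are available as non-negative numbers keyed by a transcript identifier. The function names, the pseudocount, and the two-fold cutoff are hypothetical choices made for this example and are not taken from the specification.

```python
import math


def log2_fold_changes(treated, control, pseudocount=1.0):
    """Return log2(treated / control) for each transcript present in both samples.

    `treated` and `control` map a transcript identifier to a non-negative
    expression level. The pseudocount (an assumption of this sketch) avoids
    division by zero for transcripts that are not detected in one sample.
    """
    changes = {}
    for transcript_id, treated_level in treated.items():
        if transcript_id not in control:
            continue  # no respective control level to compare against
        ratio = (treated_level + pseudocount) / (control[transcript_id] + pseudocount)
        changes[transcript_id] = math.log2(ratio)
    return changes


def modulated_transcripts(treated, control, min_abs_log2=1.0):
    """Keep transcripts whose level changed by at least the given log2 cutoff.

    The default cutoff of 1.0 (a two-fold change) is illustrative only.
    """
    changes = log2_fold_changes(treated, control)
    return {t: c for t, c in changes.items() if abs(c) >= min_abs_log2}
```

Each transcript is compared only to its own control level, so a panel of three to ten transcripts yields three to ten independent fold-change values rather than a single pooled score.
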
In another aspect, the present invention provides a method for treating a mammal
having prostate cancer comprising administering a compound identified by the assay
described herein.

In another aspect, the present invention provides a pharmaceutical composition for 
treating a mammal having prostate cancer, the composition comprising a compound 
identified by the assay described herein and a physiologically acceptable excipient.

In one aspect, the present invention provides a method of screening drug candidates
by providing a cell expressing a gene that is up- or down-regulated as in prostate cancer.
In one embodiment, the gene is selected from Tables 1A-4. The method further includes
adding a drug candidate to the cell and determining the effect of the drug candidate on the
expression of the expression profile gene.

In one embodiment, the method of screening drug candidates includes comparing the 
level of expression in the absence of the drug candidate to the level of expression in the 
presence of the drug candidate, wherein the concentration of the drug candidate can vary 
when present, and wherein the comparison can occur after addition or removal of the drug 
candidate. In a preferred embodiment, the cell expresses at least two expression profile
genes. The profile genes may show an increase or decrease.

Also provided is a method of evaluating the effect of a candidate prostate cancer drug
comprising administering the drug to a transgenic animal expressing or over-expressing the
prostate cancer modulatory protein, or an animal lacking the prostate cancer modulatory
protein, for example as a result of a gene knockout.

Moreover, provided herein is a biochip comprising one or more nucleic acid segments
of Tables 1A-4, wherein the biochip comprises fewer than 1000 nucleic acid probes.
Preferably, at least two nucleic acid segments are included. More preferably, at least three
nucleic acid segments are included.

Furthermore, a method of diagnosing a disorder associated with prostate cancer is
provided. The method comprises determining the expression of a gene of Tables 1A-4 in a
first tissue type of a first individual, and comparing the distribution to the expression of the
gene from a second normal tissue type from the first individual or a second unaffected
individual. A difference in the expression indicates that the first individual has a disorder
associated with prostate cancer.

In a further embodiment, the biochip also includes a polynucleotide sequence of a
gene that is not up- or down-regulated in prostate cancer.

In one embodiment, a method is provided for screening for a bioactive agent capable of interfering
with the binding of a prostate cancer modulating protein (prostate cancer modulatory protein)
or a fragment thereof and an antibody which binds to said prostate cancer modulatory protein
or fragment thereof. In a preferred embodiment, the method comprises combining a prostate
cancer modulatory protein or fragment thereof, a candidate bioactive agent and an antibody
which binds to said prostate cancer modulatory protein or fragment thereof. The method
further includes determining the binding of said prostate cancer modulatory protein or
fragment thereof and said antibody. Where there is a change in binding, the agent is
identified as an interfering agent. The interfering agent can be an agonist or an antagonist.
Preferably, the agent inhibits prostate cancer.

Also provided herein are methods of eliciting an immune response in an individual.
In one embodiment, a method provided herein comprises administering to an individual a
composition comprising a prostate cancer modulating protein, or a fragment thereof. In
another embodiment, the protein is encoded by a nucleic acid selected from those of Tables
1A-4.

Further provided herein are compositions capable of eliciting an immune response in
an individual. In one embodiment, a composition provided herein comprises a prostate
cancer modulating protein, preferably encoded by a nucleic acid of Tables 1A-4, or a
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer
modulating protein, preferably selected from the nucleic acids of Tables 1A-4, and a
pharmaceutically acceptable carrier.

Also provided are methods of neutralizing the effect of a prostate cancer protein, or a
fragment thereof, comprising contacting an agent specific for said protein with said protein in
an amount sufficient to effect neutralization. In another embodiment, the protein is encoded
by a nucleic acid selected from those of Tables 1A-4. In another aspect of the invention, a
method of treating an individual for prostate cancer is provided. In one embodiment, the
method comprises administering to said individual an inhibitor of a prostate cancer
modulating protein. In another embodiment, the method comprises administering to a patient
having prostate cancer an antibody to a prostate cancer modulating protein conjugated to a
therapeutic moiety. Such a therapeutic moiety can be a cytotoxic agent or a radioisotope.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides novel
methods for diagnosis and evaluation of androgen-dependent prostate cells (malignant or
non-malignant), prostate cells undergoing androgen withdrawal, and androgen-independent
prostate cancer, as well as methods for treating androgen-dependent prostate cells (malignant
or non-malignant), prostate cancer undergoing androgen withdrawal, and
androgen-independent prostate cancer. The current Specification incorporates the text of USSN
09/976,858, filed October 12, 2001; USSN 60/295,917, filed June 4, 2001; USSN
60/368,689, filed March 29, 2002; USSN 60/350,666, filed November 13, 2001; and USSN
60/372,246, filed April 12, 2002.

Table 1A provides unigene cluster identification numbers for the nucleotide sequence
of genes that exhibit increased or decreased expression in androgen-independent prostate
cancer samples. Table 1A also provides an exemplar accession number that provides a
nucleotide sequence that is part of the unigene cluster. The expression patterns of the genes
of Table 1A can be broadly defined into the following categories:

Genes that are expressed early in the time course, then drop off in expression, and then
express again with emergence of androgen-independence (hi-lo-hi pattern in Table 1A).

Genes that are expressed early in the time course, then drop off in expression, and do not
express again with emergence of androgen-independence (hi-lo-lo pattern in Table 1A).

Genes that are not expressed early in the time course, but express only with emergence of
androgen-independence (lo-lo-hi pattern in Table 1A).

Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and continue to express with emergence of androgen-independence (lo-hi-hi
pattern in Table 1A).

Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo pattern
in Table 1A).

Tables 2A-C provide unigene cluster identification numbers for the nucleotide
sequence of genes that exhibit increased or decreased expression in androgen-dependent
prostate cancer, prostate cancer undergoing androgen withdrawal and androgen-independent
prostate cancer. Tables 2A-C also provide an exemplar accession number that provides a
nucleotide sequence that is part of the unigene cluster. The expression patterns of the genes
of Tables 2A-C can be broadly defined into the following six categories:

Genes that are expressed early in the time course of androgen withdrawal, then drop
off in expression, and then express again with emergence of androgen-independence
(hi-lo-lo-hi pattern in Table 2A).

Genes that are expressed early in the time course, then drop off in expression immediately
after androgen withdrawal, and do not express again with emergence of
androgen-independence (hi-lo-lo-lo pattern in Table 2A).

Genes that are expressed early in the time course, then drop off in expression after several
days of androgen withdrawal, and do not express again with emergence of
androgen-independence (hi-hi-lo-lo pattern in Table 2A).

Genes that are not expressed early in the time course, but express only with emergence
of androgen-independence (lo-lo-lo-hi pattern in Table 2A).

Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and continue to express with emergence of androgen-independence
(lo-lo-hi-hi pattern in Table 2A).

Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and drop off again with emergence of androgen-independence (lo-lo-hi-lo
pattern in Table 2A).
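
As a non-limiting illustration, the hi/lo pattern labels used for Table 1A (three time points) and Table 2A (four time points) could be derived from a measured time course roughly as sketched below. The specification does not state how "hi" and "lo" are called. The single fixed threshold, the function names, and the example values are assumptions introduced only for this sketch.

```python
from collections import defaultdict


def expression_pattern(levels, threshold):
    """Label each time point 'hi' or 'lo' and join the labels with hyphens.

    `levels` is an ordered sequence of expression values along the androgen
    withdrawal time course. `threshold` is a hypothetical cutoff: values at
    or above it are called 'hi', values below it are called 'lo'.
    """
    if not levels:
        raise ValueError("levels must contain at least one time point")
    return "-".join("hi" if level >= threshold else "lo" for level in levels)


def group_genes_by_pattern(time_courses, threshold):
    """Group gene identifiers by their hi/lo pattern.

    `time_courses` maps a gene identifier to its ordered expression values.
    """
    groups = defaultdict(list)
    for gene_id, levels in time_courses.items():
        groups[expression_pattern(levels, threshold)].append(gene_id)
    return dict(groups)
```

For example, with a threshold of 500, a three-point time course of 900, 120, 850 would be labeled hi-lo-hi, and a four-point time course of 100, 100, 800, 900 would be labeled lo-lo-hi-hi.
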


Definitions 

The term "androgen ablation therapy" refers to techniques for the removal or
destruction of sources of male hormones, such as testosterone. These techniques include, for
example, 1) surgical removal of the testicles, 2) medications such as gonadotropin releasing
hormone analogs that inhibit testosterone production, or 3) anti-androgenic drugs that block
androgen receptors.

The term "androgen-independent prostate cancer protein" or "androgen-independent
prostate cancer polynucleotide" or "androgen-independent prostate cancer-associated
transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and
interspecies homologues that: (1) have a nucleotide sequence that has greater than about 60%
nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide sequence identity, preferably over a
region of at least about 25, 50, 100, 200, 500, 1000, or more nucleotides, to a
nucleotide sequence of or associated with a unigene cluster of Tables 1A-4; (2) bind to
antibodies, e.g., polyclonal antibodies, raised against an immunogen comprising an amino
acid sequence encoded by a nucleotide sequence of or associated with a unigene cluster of
Tables 1A-4 and conservatively modified variants thereof; (3) specifically hybridize under
stringent hybridization conditions to a nucleic acid sequence, or the complement thereof, of
Tables 1A-4 and conservatively modified variants thereof; or (4) have an amino acid
sequence that has greater than about 60% amino acid sequence identity, 65%, 70%, 75%,
80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater
amino acid sequence identity, preferably over a region of at least about 25, 50,
100, 200, 500, 1000, or more amino acids, to an amino acid sequence encoded by a nucleotide
sequence of or associated with a unigene cluster of Tables 1A-4. These polynucleotides or
proteins may also be expressed during a period following androgen withdrawal. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide"
include both naturally occurring and recombinant forms, and may refer to those polypeptides
or polynucleotides which are expressed in prostate proliferative cells.

A "full length" prostate cancer protein or nucleic acid refers to a prostate cancer
polypeptide or polynucleotide sequence, or a variant thereof, that contains the elements
normally contained in one or more naturally occurring, wild type prostate cancer
polynucleotide or polypeptide sequences. The "full length" may be prior to, or after, various
stages of post-translation processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histology purposes,
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also
include explants and primary and/or transformed cell cultures derived from patient tissues. A
biological sample is typically obtained from a eukaryotic organism, most preferably a
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea
pig, rat, mouse; rabbit; or a bird; reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), by collecting a
sample which contains a soluble polypeptide or nucleic acid derived from a prostate cell, or
by performing the methods of the invention in vivo. Archival tissues, having treatment or
outcome history, will be particularly useful.

The terms "identical" or percent "identity," in the context of two or more nucleic
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the
same or have a specified percentage of amino acid residues or nucleotides that are the same
(i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and aligned
for maximum correspondence over a comparison window or designated region) as measured
using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters
described below, or by manual alignment and visual inspection (see, e.g., NCBI web site
http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be
"substantially identical." This definition also refers to, or may be applied to, the complement
of a test sequence. The definition also includes sequences that have deletions and/or
additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.

For sequence comparison, typically one sequence acts as a reference sequence, to 
which test sequences are compared. When using a sequence comparison algorithm, test and 
reference sequences are entered into a computer, subsequence coordinates are designated, if 
necessary, and sequence algorithm program parameters are designated. Preferably, default 
program parameters can be used, or alternative parameters can be designated. The sequence 
comparison algorithm then calculates the percent sequence identities for the test sequences 
relative to the reference sequence, based on the program parameters. 
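
As an illustration of the calculation described above, the following minimal Python sketch computes percent identity for a test sequence and a reference sequence that have already been aligned (gaps written as "-"). The function name percent_identity and the convention of dividing by the number of aligned columns are assumptions made for this illustration only; BLAST and other programs apply their own conventions.

```python
def percent_identity(aligned_test: str, aligned_ref: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumption for this sketch: the inputs are pre-aligned, equal-length strings
    and the denominator is the total number of alignment columns.
    """
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_test:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_test.upper(), aligned_ref.upper())
        if a == b and a != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_test)
```

With this convention, a test sequence that differs from the reference at one of four aligned positions scores 75% identity.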

A "comparison window", as used herein, includes reference to a segment of any one of the
number of contiguous positions selected from the group consisting typically of from 20 to
600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence 
may be compared to a reference sequence of the same number of contiguous positions after 
the two sequences are optimally aligned. Methods of alignment of sequences for comparison 
are well-known in the art. Optimal alignment of sequences for comparison can be conducted, 
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by
the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443- 
453, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad. 
Sci. USA 85:2444-2448, by computerized implementations of these algorithms (GAP, 
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics 
Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual 
inspection (see, e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in 
Molecular Biology Lippincott). 
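
By way of illustration, a global alignment in the spirit of the Needleman and Wunsch approach cited above can be scored by dynamic programming. The sketch below is a simplified variant, not the published algorithm as such: the function name, the match/mismatch values, and the linear gap penalty are all assumptions chosen for the example.

```python
def needleman_wunsch_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of two sequences.

    Scoring values are illustrative assumptions; a linear gap penalty is used.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
    return score[-1][-1]
```

The table holds, for every pair of prefixes, the best score achievable, so the bottom-right cell is the score of the optimal alignment of the full sequences.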

Preferred examples of algorithms that are suitable for determining percent sequence
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the
invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915-919), alignments (B)
of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
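
The extension step described above can be sketched as follows for an ungapped, rightward extension of a nucleotide word hit. This is a simplified illustration and not the BLAST implementation: the function name extend_right and the value of the drop-off X (x_drop=20) are assumptions, while the reward M=5 and penalty N=-4 are the BLASTN defaults stated in the text.

```python
def extend_right(query, subject, q_start, s_start,
                 reward=5, penalty=-4, x_drop=20):
    """Extend an ungapped hit to the right; return (best_score, best_length).

    Extension halts when the score falls x_drop below its maximum, when the
    cumulative score reaches zero or below, or at the end of either sequence.
    x_drop=20 is an illustrative assumption, not a documented default.
    """
    score = 0
    best_score = 0
    best_len = 0
    i = 0
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += reward if same else penalty
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        if score <= 0 or best_score - score >= x_drop:
            break
    return best_score, best_len
```

A full implementation would also extend to the left of the word hit and, for amino acid sequences, would take pair scores from a scoring matrix rather than from fixed reward and penalty values.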

The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5877).
One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.

An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross
reactive with the antibodies raised against the polypeptide encoded by the second nucleic
acid, as described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules
or their complements hybridize to each other under stringent conditions, as described below.
Yet another indication that two nucleic acid sequences are substantially identical is that the
same primers can be used to amplify the sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an 
expression vector and supports the replication or expression of the expression vector. Host 
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be 
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org). 

The terms "isolated," "purified," or "biologically pure" refer to material that is 
substantially or essentially free from components that normally accompany it as found in its 
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogeneous, e.g., 100% pure.

The terms "polypeptide," "peptide," and "protein" are used interchangeably herein to
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers. Certain diagnostic
methods may evaluate secreted or breakdown products present only because the producing
cell is present, and would otherwise be absent in a normal individual.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well
as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three letter 
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes. 

"Conservatively modified variants" applies to both amino acid and nucleic acid 
sequences. With respect to particular nucleic acid sequences, conservatively modified 
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU encode the amino acid
alanine. Thus, at every position where an alanine is specified by a codon, the codon can be
altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, often silent variations of
a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to
the expression product, but not with respect to actual probe sequences.
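
The degeneracy described above can be illustrated with a short Python sketch that lists the synonymous codons for an amino acid and counts how many distinct coding sequences encode a given peptide. The sketch assumes the standard genetic code written in the DNA alphabet (T in place of U, so the alanine codon GCU appears as GCT); the names CODON_TABLE, synonymous_codons, and count_encoding_sequences are hypothetical and introduced only for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return the sorted list of codons encoding a one-letter amino acid."""
    codons = sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return codons


def count_encoding_sequences(peptide):
    """Count the coding sequences (silent variants included) for a peptide."""
    total = 1
    for aa in peptide:
        total *= len(synonymous_codons(aa))
    return total
```

For example, methionine and tryptophan each have a single codon while alanine has four, so the tripeptide Met-Ala-Trp is encoded by four distinct nucleotide sequences.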

As to amino acid sequences, one of skill will recognize that individual substitutions, 
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alter, add or delete a single amino acid or a small percentage of amino acids in the encoded
sequence are "conservatively modified variants" where the alteration results in the substitution
of an amino acid with a chemically similar amino acid. Conservative substitutions providing
functionally similar amino acids are well known in the art. Such conservatively modified
variants are in addition to and do not exclude polymorphic variants, interspecies homologs,
and alleles of the invention. The following are typically conservative substitutions for one
another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N),
Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M),
Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8)
Cysteine (C), Methionine (M) (see, e.g., Creighton (1984) Proteins Freeman).
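
The eight groups listed above can be encoded directly, for example to check whether a given replacement falls within one of them. The following Python sketch is illustrative only; the names CONSERVATIVE_GROUPS and is_conservative_substitution are hypothetical, and treating an unchanged residue as "not a substitution" is an assumption of the example.

```python
# The eight groups from the text, written with one-letter amino acid codes.
# Methionine (M) appears in both group 5 and group 8, as in the text.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative_substitution(original, replacement):
    """True if the two different residues share at least one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```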

Macromolecular structures such as polypeptide structures can be described in terms of
various levels of organization. For a general discussion of this organization, see, e.g.,
Alberts, et al. (2001) Molecular Biology of the Cell (4th ed.) and Cantor and Schimmel
(1980) Biophysical Chemistry Part I: The Conformation of Biological Macromolecules
Freeman. "Primary structure" refers to the amino acid sequence of a particular peptide.
"Secondary structure" refers to locally ordered, three dimensional structures within a
polypeptide. These structures are commonly known as domains. Domains are portions of a
polypeptide that often form a compact unit of the polypeptide and are typically 25 to
approximately 500 amino acids long. Typical domains are made up of sections of lesser
organization such as stretches of β-sheet and α-helices. "Tertiary structure" refers to the
complete three dimensional structure of a polypeptide monomer. "Quaternary structure"
refers to the three dimensional structure formed, usually by the noncovalent association of
independent tertiary units. Anisotropic terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents 
used herein means at least two nucleotides covalently linked together. Oligonucleotides are
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of
virtually any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000,
7000, 10,000, etc. A nucleic acid of the present invention will generally contain
phosphodiester bonds, although in some cases, nucleic acid analogs are included that may
have alternate backbones, comprising, e.g., phosphoramidate, phosphorothioate,
phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein (1992)
Oligonucleotides and Analogues: A Practical Approach. Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones; non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7 in
Sanghvi and Cook (eds. 1994) Carbohydrate Modifications in
Antisense Research ACS Symposium Series 580. Nucleic acids containing one or more
carbocyclic sugars are also included within one definition of nucleic acids. Modifications of
the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the
stability and half-life of such molecules in physiological environments or as probes on a
biochip. Mixtures of naturally occurring nucleic acids and analogs can be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

A variety of references disclose such nucleic acid analogs, including, for example, 
phosphoramidate (Beaucage, et al. (1993) Tetrahedron 49(10):1925-1963 and references 
therein; Letsinger (1970) J. Org. Chem. 35:3800-3803; Sprinzl, et al. (1977) Eur. J. Biochem. 

81:579-589; Letsinger, et al. (1986) Nucl. Acids Res. 14:3487-499; Sawai, et al. (1984)
Chem. Lett. 805; Letsinger, et al. (1988) J. Am. Chem. Soc. 110:4470-4471; and Pauwels, et
al. (1986) Chemica Scripta 26:141-149), phosphorothioate (Mag, et al. (1991) Nucleic Acids
Res. 19:1437-441; and U.S. Patent No. 5,644,048), phosphorodithioate (Briu, et al. (1989) J.
Am. Chem. Soc. 111:2321-xxx), O-methylphosphoroamidite linkages (see Eckstein (1992)
Oligonucleotides and Analogues: A Practical Approach Oxford University Press), and
peptide nucleic acid backbones and linkages (see Egholm (1992) J. Am. Chem. Soc.
114:1895-1897; Meier, et al. (1992) Chem. Int. Ed. Engl. 31:1008-1010; Nielsen (1993)
Nature 365:566-568; Carlsson, et al. (1996) Nature 380:207, each of which is incorporated by
reference). Other analog nucleic acids include those with positive backbones (Dempcy, et al.
(1995) Proc. Natl. Acad. Sci. USA 92:6097-101); non-ionic backbones (U.S. Patent Nos.
5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowski, et al. (1991) Angew.
Chem. Intl. Ed. English 30:423-426; Letsinger, et al. (1988) J. Am. Chem. Soc. 110:4470;
Letsinger, et al. (1994) Nucleoside and Nucleotide 13:1597-xxx; Chapters 2 and 3 in
Sanghvi and Cook (eds. 1994) Carbohydrate Modifications in Antisense Research ACS
Symposium Series 580; Mesmaeker, et al. (1994) Bioorganic and Medicinal Chem. Lett.
4:395-xxx; Jeffs, et al. (1994) J. Biomolecular NMR 34:17; Horn (1996) Tetrahedron Lett.
37:743-xxx) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7 in Sanghvi and Cook (eds. 1994)
Carbohydrate Modifications in Antisense Research ACS Symposium Series 580. Nucleic
acids containing one or more carbocyclic sugars are also included within one definition of
nucleic acids (see Jenkins, et al. (1995) Chem. Soc. Rev. xx:169-176). Several nucleic acid
analogs are described in Rawls (p. 35, June 2, 1997) C&E News. Each of these references is
hereby expressly incorporated by reference. 

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched base pairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these 
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded 
by cellular enzymes, and thus can be more stable. 

The nucleic acids may be single stranded or double stranded, as specified, or contain 
portions of both double stranded and single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic
acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of
bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine,
isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally occurring RNA, e.g.,
a pre-mRNA, hnRNA, or mRNA. As used herein, the term "nucleoside" includes nucleotides
and nucleoside and nucleotide analogs, and modified nucleosides such as amino modified
nucleosides. In addition, "nucleoside" includes non-naturally occurring analog structures.
Thus, e.g., the individual units of a peptide nucleic acid, each containing a base, are referred
to herein as a nucleoside.
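
As noted above, the depiction of a single strand also defines the sequence of the complementary strand. A minimal Python sketch of that relationship for a DNA sequence written 5' to 3' follows; the function name reverse_complement is hypothetical, and the sketch assumes only the four standard DNA bases (modified bases such as inosine are outside its scope).

```python
# Watson-Crick pairing for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, also written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines the other.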

A "label" or a "detectable moiety" is a composition detectable by spectroscopic, 
photochemical, biochemical, immunochemical, chemical, or other physical means. For 
example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g.,
as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other entities
which can be made detectable, e.g., by incorporating a radiolabel into the peptide or used to
detect antibodies specifically reactive with the peptide. The labels may be incorporated into
the prostate cancer nucleic acids, proteins, and antibodies at virtually any position. Many
methods for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature, 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either 
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der 
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be 
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a nucleic acid 
capable of binding to a target nucleic acid of complementary sequence through one or more 
types of chemical bonds, usually through complementary base pairing, usually through
hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or
modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be
joined by a linkage other than a phosphodiester bond, so long as it does not functionally
interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the
constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be
understood by one of skill in the art that probes may bind target sequences lacking complete
complementarity with the probe sequence depending upon the stringency of the hybridization
conditions. The probes are preferably directly labeled as with isotopes, chromophores,
lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin 
complex may later bind. By assaying for the presence or absence of the probe, one can detect 
the presence or absence of the select sequence or subsequence. Diagnosis or prognosis may 
be based at the genomic level, or at the level of RNA or protein expression. 

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid,
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant
cells express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic 
acid as depicted above. 

The term "heterologous" when used with reference to portions of a nucleic acid
indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two 
or more subsequences that are not found in the same relationship to each other in nature (e.g., 
a fusion protein). 

A "promoter" is defined as an array of nucleic acid control sequences that direct
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid
sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence. 

An "expression vector" is a nucleic acid construct, generated recombinantly or 
synthetically, with a series of specified nucleic acid elements that permit transcription of a 
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed operably linked to a promoter. 

The phrase "selectively (or specifically) hybridizes to" refers to the binding, 
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under 
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a 
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic 
acids, but to no other sequences. Stringent conditions are sequence-dependent and will be 
different in different circumstances. Longer sequences hybridize specifically at higher
temperatures. An extensive guide to the hybridization of nucleic acids is found in "Overview of
principles of hybridization and the strategy of nucleic acid assays" in Tijssen (1993)
Hybridization with Nucleic Probes (Techniques in Biochemistry and Molecular Biology vol.
24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other
salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes (e.g., 10 to
50 nucleotides) and at least about 60° C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC, and
0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency
amplification, although annealing temperatures may vary between about 32° C and 48° C
depending on primer length. For high stringency PCR amplification, a temperature of about
62° C is typical, although high stringency annealing temperatures can range from about
50-65° C, depending on the primer length and specificity. Typical cycle conditions for both high
and low stringency amplifications include a denaturation phase of 90-95° C for 30-120 sec,
an annealing phase lasting 30-120 sec, and an extension phase of about 72° C for 1-2 min.
Protocols and guidelines for low and high stringency amplification reactions are provided,
e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and Applications Academic
Press, N.Y.
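
Two of the numerical criteria stated above, a hybridization temperature about 5-10° C below the Tm and a positive signal of at least two times background, can be expressed as simple checks. The Python sketch below is purely illustrative; the function names are hypothetical, and the Tm is taken as an input because its determination depends on conditions not modeled here.

```python
def stringent_temperature_window(tm_celsius):
    """Return (low, high) in degrees C: the range 10 to 5 degrees below the Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def is_positive_signal(signal, background, fold=2.0):
    """True if the signal is at least `fold` times background.

    fold=2.0 is the minimum stated in the text; 10.0 is the preferred value.
    """
    return signal >= fold * background
```

For a probe with an assumed Tm of 70° C, the window is 60-65° C; a signal of 25 units over a background of 10 units meets the two-fold minimum but not the preferred ten-fold level.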

Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides which they encode are substantially identical. This
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under
moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37° C, and a wash in 1X SSC at 45° C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references,
e.g., Ausubel, et al. (eds. 1991 and supplements) Current Protocols in Molecular Biology.

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a prostate cancer protein includes the determination of a parameter that
is indirectly or directly under the influence of the prostate cancer protein or nucleic acid, e.g.,
a functional, physical, or chemical effect, such as the ability to decrease prostate proliferation
(malignant or non-malignant). It includes ligand binding activity; cell growth on soft agar;
anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and 
protein expression in cells undergoing metastasis, and other characteristics of prostate cancer 
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities. 

By "determining the functional effect" is meant assaying for a compound that 

increases or decreases a parameter that is indirectly or directly under the influence of a 
prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects. 
Such functional effects can be measured by means known to those skilled in the art, e.g., 
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index), 

hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein, 
measuring inducible markers or transcriptional activation of the prostate cancer protein; 
measuring binding activity or binding assays, e.g., binding to antibodies or other ligands, and 
measuring cellular proliferation. Determination of the functional effect of a compound on 
prostate cancer can also be performed using prostate cancer assays known to those of skill in 

the art such as in vitro assays, e.g., cell growth on soft agar; anchorage dependence; 
contact inhibition and density limitation of growth; cellular proliferation; cellular 
transformation; growth factor or serum dependence; tumor specific marker levels; 
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein 
expression in cells undergoing metastasis, and other characteristics of prostate cancer cells. 

The functional effects can be evaluated by many means known to those skilled in the art, e.g., 
microscopy for quantitative or qualitative measures of alterations in morphological features, 
measurement of changes in RNA or protein levels for prostate cancer-associated sequences, 
measurement of RNA stability, identification of downstream or reporter gene expression 
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence, 
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of prostate cancer polynucleotide and 
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or 
compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide and 
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block 
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the 
activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic acids 

may seem to inhibit expression and subsequent function of the protein. "Activators" are 

compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize, or 
up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also include 
genetically modified versions of prostate cancer proteins, e.g., versions with altered activity, 
as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, small 

chemical molecules and the like. Such assays for inhibitors and activators include, e.g., 

expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative 
modulator compounds, and then determining the functional effects on activity, as described 
above. Activators and inhibitors of prostate cancer can also be identified by incubating 
prostate cancer cells with the test compound and determining increases or decreases in the 

expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50 
or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences 
set out in Tables 1A-4. 

Samples or assays comprising prostate cancer proteins that are treated with a potential 
activator, inhibitor, or modulator are compared to control samples without the inhibitor, 

activator, or modulator to examine the extent of inhibition. Control samples (untreated with 
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide 
is achieved when the activity value relative to the control is about 80%, preferably 50%, more 
preferably 25-0%. Activation of a prostate cancer polypeptide is achieved when the activity 
value relative to the control (untreated with activators) is 110%, more preferably 150%, more 

preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably 
1000-3000% higher. 
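
The comparison against an untreated control described above can be sketched as a small calculation. This is a minimal illustration only, assuming activity is available as a single positive number per sample; the function names and the "no effect" label are hypothetical and not part of the disclosure. The 80% and 110% cutoffs are the outermost thresholds stated in the text.

```python
def relative_activity(treated: float, control: float) -> float:
    """Return treated activity as a percentage of the untreated control (control = 100%)."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


def classify_activity(treated: float, control: float) -> str:
    """Label a test compound using the thresholds given in the text.

    Inhibition: relative activity of about 80% or lower.
    Activation: relative activity of 110% or higher.
    Anything in between is labeled "no effect" (a hypothetical label).
    """
    percent = relative_activity(treated, control)
    if percent <= 80.0:
        return "inhibition"
    if percent >= 110.0:
        return "activation"
    return "no effect"
```

For instance, a treated sample measuring 50 units against a control of 100 units would be labeled as inhibition, while 150 units against the same control would be labeled as activation.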

The phrase "changes in cell growth" refers to a change in cell growth and 
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci, 
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and 
density limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers, 
ability to form or suppress tumors when injected into suitable animal hosts, and/or 
immortalization of the cell. See, e.g., pp. 231-241 in Freshney (1994) Culture of Animal 
Cells: A Manual of Basic Technique (3d ed.) Wiley-Liss. 

"Tumor cell" refers to precancerous, cancerous, and/or normal cells in a tumor. 

"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to 

spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new 
genetic material. Although transformation can arise from infection with a transforming virus 
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 

Transformation is associated with phenotypic changes, such as immortalization of cells, 
aberrant growth control, nonmorphological changes, and/or malignancy. See, Freshney 
(2001) Culture of Animal Cells: A Manual of Basic Technique (4th ed.) Wiley-Liss. 

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 

The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 

epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region 
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as 
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or 

its functional equivalent will be most critical in specificity and affinity of binding. See Paul 
(ed. 1999) Fundamental Immunology (4th ed.) Raven. 

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each 
tetramer is composed of two identical pairs of polypeptide chains, each pair having one 
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each 
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible 
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) 
refer to these light and heavy chains respectively. 

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized 
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an 
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab 
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be 
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby 
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab 
with part of the hinge region (see Paul (ed. 1993) Fundamental Immunology (3d ed.) Raven). 
While various antibody fragments are defined in terms of the digestion of an intact antibody, 
one of skill will appreciate that such fragments may be synthesized de novo either chemically 
or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also 
includes antibody fragments either produced by the modification of whole antibodies, or 
those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or 
those identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature 
348:552-554). 

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 

antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein 
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; pp. 77-96 in 
Cole, et al. (1985) Monoclonal Antibodies and Cancer Therapy Liss; Coligan (1991) Current 
Protocols in Immunology Lippincott; Harlow and Lane (1988) Antibodies: A Laboratory 

Manual CSH Press; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d 
ed.) Academic Press). Techniques for the production of single chain antibodies (U.S. Patent 
4,946,778) can be adapted to produce antibodies to polypeptides of this invention. Also, 
transgenic mice, or other organisms such as other mammals, may be used to express 
humanized antibodies. Alternatively, phage display technology can be used to identify 

antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., 
McCafferty, et al. (1990) Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779- 
783). 

A "chimeric antibody" is an antibody molecule in which (a) the constant region, or a 
portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable 
region) is linked to a constant region of a different or altered class, effector function and/or 
species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 

Identification of prostate cancer-associated sequences 
In one aspect, the expression levels of genes are determined in different patient 

samples for which diagnosis information is desired, to provide expression profiles. An 
expression profile of a particular sample is essentially a "fingerprint" of the state of the 
sample; while two states may have a particular gene similarly expressed, the evaluation of a 
number of genes simultaneously allows the generation of a gene expression profile that is 

characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other 
tissue) may be distinguished from pathological prostate cells, e.g., cancerous or metastatic 
cancerous tissue of the prostate, or prostate cancer tissue or metastatic prostate cancerous 
tissue can be compared with tissue samples of prostate and other tissues from surviving 
cancer patients. By comparing expression profiles of tissue in known different prostate 

cancer states, information regarding which genes are important (including both up- and 
down-regulation of genes) in each of these states is obtained. 

The identification of sequences that are differentially expressed in prostate cancer 
versus non-prostate cancer tissue allows the use of this information in a number of ways. For 
example, a particular treatment regime may be evaluated: does a chemotherapeutic drug act 

to down-regulate prostate cancer or other proliferative disorders, and thus tumor growth or 
recurrence, in a particular patient? Alternatively, a treatment step may induce other markers 
which may be used as targets to destroy tumor cells. Similarly, diagnosis and treatment 
outcomes may be done or confirmed by comparing patient samples with the known 
expression profiles. Malignant disease may be compared to non-malignant conditions. 

Metastatic tissue can also be analyzed to determine the stage of prostate cancer in the tissue, 
or origin of primary tumor, e.g., metastasis from a remote primary site. Furthermore, these 
gene expression profiles (or individual genes) allow screening of drug candidates with an eye 
to mimicking or altering a particular expression profile; e.g., screening can be done for drugs 
that suppress the prostate cancer expression profile. This may be done by making biochips 

comprising sets of the important prostate cancer genes, which can then be used in these 
screens. These methods can also be done on the protein basis; that is, protein expression 
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen 
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered 
for gene therapy purposes, including the administration of antisense nucleic acids, or the 
prostate cancer proteins (including antibodies and other modulators thereof) administered as 
therapeutic drugs. 

Thus the present invention provides nucleic acid and protein sequences that are 

differentially expressed in prostate cancer relative to normal tissues and/or non-malignant 
disease, or in different types of related diseases, herein termed "prostate cancer sequences." 
As outlined below, prostate cancer sequences include those that are up-regulated (i.e., 
expressed at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., 

expressed at a lower level). In a preferred embodiment, the prostate cancer sequences are 
from humans; however, as will be appreciated by those in the art, prostate cancer sequences 
from other organisms may be useful in animal models of disease and drug evaluation; thus, 
other prostate cancer sequences are provided, from vertebrates, including mammals, 
including rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including 

sheep, goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer 
sequences from other organisms may be obtained using the techniques outlined below. 

Prostate cancer sequences can include both nucleic acid and amino acid sequences. 
As will be appreciated by those in the art and is more fully outlined below, prostate cancer 
nucleic acid sequences are useful in a variety of applications, including diagnostic 

applications, which will detect naturally occurring nucleic acids, as well as screening 

applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates with 
selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic acid 
and/or amino acid sequence homology to the prostate cancer sequences outlined herein. Such 

homology can be based upon the overall nucleic acid or amino acid sequence, and is 

generally determined as outlined below, using either homology programs or hybridization 
conditions. 

For identifying prostate cancer-associated sequences, the prostate cancer screen 
typically includes comparing genes identified in different tissues, e.g., normal and cancerous 
tissues, or tumor tissue samples from patients who have metastatic disease vs. non-metastatic 
tissue. Other suitable tissue comparisons include comparing prostate cancer samples with 
metastatic cancer samples from other cancers, such as lung, breast, gastrointestinal, 
ovarian, etc. Samples of different stages of prostate cancer, e.g., survivor tissue, drug 
resistant states, and tissue undergoing metastasis, are applied to biochips comprising nucleic 
acid probes. The samples are first microdissected, if applicable, and treated as is known in 
the art for the preparation of mRNA. Suitable biochips are commercially available, e.g., from 
Affymetrix. Gene expression profiles are generated and the data analyzed. 

In one embodiment, the genes showing changes in expression as between normal and 
disease states are compared to genes expressed in other normal tissues, preferably normal 
prostate, but also including, and not limited to lung, heart, brain, liver, breast, kidney, muscle, 
colon, small intestine, large intestine, spleen, bone, and placenta. In a preferred embodiment, 

those genes identified during the prostate cancer screen that are expressed in a significant 
amount in other tissues are removed from the profile, although in some embodiments, this is 
not necessary. That is, when screening for drugs, it is usually preferable that the target be 
disease specific, to minimize possible side effects in other organs where the target is also expressed. 
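
The filtering step described above can be sketched as follows. This is an illustrative sketch only: the text does not define what counts as a "significant amount" of expression, so the `max_normal_level` cutoff, the data layout, and the gene and tissue names used in any call are hypothetical assumptions.

```python
def disease_specific(candidates, normal_expression, max_normal_level):
    """Keep candidate genes whose expression in other normal tissues stays below a cutoff.

    candidates: iterable of gene identifiers found in the prostate cancer screen.
    normal_expression: dict mapping gene -> {tissue name: expression level}.
    max_normal_level: assumed cutoff standing in for "a significant amount".
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        # A gene with no measurements in other tissues is kept by default.
        if all(level < max_normal_level for level in levels.values()):
            kept.append(gene)
    return kept
```

A gene measured at high levels in, e.g., normal heart or liver would be dropped from the profile, while one detected only at background levels elsewhere would be retained as a candidate target.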

In a preferred embodiment, prostate cancer sequences are those that are up-regulated 

in prostate cancer or related conditions; that is, the expression of these genes is higher in the 
prostate cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein 
often means at least about a two-fold change, preferably at least about a three fold change, 
with at least about five-fold or higher being preferred. Another embodiment is directed to 
sequences up-regulated in non-malignant conditions relative to normal. 

Unigene cluster identification numbers and accession numbers herein are for the 

GenBank sequence database and the sequences of the accession numbers are hereby 
expressly incorporated by reference. GenBank is known in the art, see, e.g., Benson, et al. 
(1998) Nucleic Acids Research 26:1-7 and http://www.ncbi.nlm.nih.gov/. Sequences are also 
available in other databases, e.g., European Molecular Biology Laboratory (EMBL) and 

DNA Database of Japan (DDBJ). U.S. Patent Application Nos. 09/687,576 and 09/976,858 (- 
001-3) further disclose related sequences, compositions, and methods of diagnosis and 
treatment of prostate cancer and related conditions and are hereby expressly incorporated by 
reference. 

In another preferred embodiment, prostate cancer sequences are those that are 
down-regulated in the prostate cancer; that is, the expression of these genes is lower in prostate 
cancer tissue as compared to non-cancerous tissue. "Down-regulation" as used herein often 
means at least about a two-fold change, preferably at least about a three fold change, with at 
least about five-fold or higher being preferred. 
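
The fold-change definitions of up- and down-regulation given above can be illustrated with a short calculation. This is a sketch under stated assumptions: expression is taken to be a single positive value per tissue, and the function name and the "unchanged" label are hypothetical. The default two-fold cutoff is the minimum change stated in the text; three or five can be passed for the preferred thresholds.

```python
def regulation_call(tumor: float, normal: float, min_fold: float = 2.0) -> str:
    """Compare expression in prostate cancer tissue with non-cancerous tissue.

    Returns "up-regulated" when tumor/normal is at least min_fold,
    "down-regulated" when normal/tumor is at least min_fold,
    and "unchanged" otherwise.
    """
    if tumor <= 0 or normal <= 0:
        raise ValueError("expression values must be positive")
    if min_fold <= 1:
        raise ValueError("min_fold must be greater than 1")
    ratio = tumor / normal
    if ratio >= min_fold:
        return "up-regulated"
    if ratio <= 1.0 / min_fold:
        return "down-regulated"
    return "unchanged"
```

With the default cutoff, a gene measured at 300 units in tumor and 100 units in normal tissue is called up-regulated; the same gene would not pass a five-fold cutoff.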

Informatics 

The ability to identify genes that are over or under expressed in prostate cancer can 

additionally provide high-resolution, high-sensitivity datasets which can be used in the areas 
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure, 
biosensor development, and other related areas. For example, the expression profiles can be 
used in diagnostic or prognostic evaluation of patients with prostate cancer. Or as another 

example, subcellular toxicological information can be generated to better direct drug structure 
and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets, Mechanism, and 
Function, paper presented at the IBC Proteomics conference, Coronado, CA (June 11-12, 
1998)). Subcellular toxicological information can also be utilized in a biological sensor 
device to predict the likely toxicological effect of chemical exposures and likely tolerable 

exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue from 

datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids, saccharides, 
lipids, drugs, and the like). 

Thus, in another embodiment, the present invention provides a database that includes 
at least one set of assay data. The data contained in the database is acquired, e.g., using array 

analysis either singly or in a library format. The database can be in a form in which data can 
be maintained and transmitted, but is preferably an electronic database. The electronic 
database of the invention can be maintained on an electronic device allowing for the storage 
of and access to the database, such as a personal computer, but is preferably distributed on a 
wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence data is for 

clarity of illustration only. It will be apparent to those of skill in the art that similar databases 
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or 
absolute abundance of a variety of molecular and macromolecular species from a biological 

sample undergoing prostate cancer, i.e., the identification of prostate cancer-associated 

sequences described herein, provide an abundance of information, which can be correlated 
with pathological conditions, predisposition to disease, drug testing, therapeutic monitoring, 
gene-disease causal linkages, identification of correlates of immunity and physiological 
status, among others. Although the data generated from the assays of the invention is suited 
for manual review and analysis, in a preferred embodiment, prior data processing using high- 
speed computers is utilized. 

An array of methods for indexing and retrieving biomolecular information is known 

in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database 
system for storing biomolecular sequence information in a manner that allows sequences to 
be catalogued and searched according to one or more protein function hierarchies. U.S. 
Patent 5,953,727 discloses a relational database having sequence records containing 

information in a format that allows a collection of partial-length DNA sequences to be 

catalogued and searched according to association with one or more sequencing projects for 
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent 
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene 
sequence similar to a sequence data item in a gene database based on the degree of similarity 

between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method 

using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences 
in computer databases by comparison of predicted mass spectra with experimentally-derived 
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a multi- 
dimensional database comprising a functionality for multi-dimensional data analysis 

described as on-line analytical processing (OLAP), which entails the consolidation of 

projected and actual data according to more than one consolidation path or dimension. U.S. 
Patent 5,295,261 reports a hybrid database structure in which the fields of each database 
record are divided into two classes, navigational and informational data, with navigational 
fields stored in a hierarchical topological map which can be viewed as a tree structure or as 

the merger of two or more such tree structures. 

See also Mount, et al. (2001) Bioinformatics CSH Press; Durbin, et al. (eds. 1999) 
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids 
Cambridge Univ. Press; Baxevanis and Ouellette (eds., 1998) Bioinformatics: A Practical 
Guide to the Analysis of Genes and Proteins Wiley-Liss; Rashidi and Buehler (1999) 

Bioinformatics: Basic Applications in Biological Science and Medicine CRC Press; Setubal, 
et al. (eds. 1997) Introduction to Computational Molecular Biology Brooks/Cole; Misener 
and Krawetz (eds. 2000) Bioinformatics: Methods and Protocols Humana Press; Higgins and 
Taylor (eds. 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical 
Approach Oxford Univ. Press; Brown (2001) Bioinformatics: A Biologist's Guide to 
Biocomputing and the Internet Eaton Pub; Han and Kamber (2000) Data Mining: Concepts 
and Techniques Kaufmann Pub.; and Waterman (1995) Introduction to Computational 
Biology: Maps, Sequences, and Genomes Chapman and Hall. 

The present invention provides a computer database comprising a computer and 
software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing sample 

is from a control tissue sample known to be free of pathological disorders. In a variation, at 
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or 
another tissue specimen to be analyzed for prostate cancer. In another variation, the assay 
records cross-tabulate one or more of the following parameters for each target species in a 

sample: (1) a unique identification code, which can include, e.g., a target molecular structure 
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample 
source; and (3) absolute and/or relative quantity of the target species present in the sample. 
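
The three cross-tabulated parameters listed above map naturally onto a simple record type. The following is a minimal sketch, not the database format of the invention; the class name `AssayRecord`, its field names, and the nested-dictionary layout are hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) source of the target-containing sample
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Build a target-by-source table: {target_id: {sample_source: quantity}}."""
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    return {target: dict(by_source) for target, by_source in table.items()}
```

Looking up one target in the resulting table returns its quantity in each sample source, which is the comparison a control-versus-pathological analysis would start from.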

The invention also provides for the storage and retrieval of a collection of target data 
in a computer data storage apparatus, which can include magnetic disks, optical disks, 

magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic 
bubble memory devices, and other data storage devices, including CPU registers and on-CPU 
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of 
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate 
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor 

and a charge storage area, which may be on the transistor). In one embodiment, the invention 
provides such storage devices, and computer systems built therewith, comprising a bit pattern 
encoding a protein expression fingerprint record comprising unique identifiers for at least 10 
target data records cross-tabulated with target source. 

When the target is a peptide or nucleic acid, the invention preferably provides a 

method for identifying related peptide or nucleic acid sequences, comprising performing a 
computerized comparison between a peptide or nucleic acid sequence assay record stored in 
or retrieved from a computer storage device or database and at least one other sequence. The 
comparison can include a sequence analysis or comparison algorithm or computer program 
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may 
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences 
determined from a polypeptide or nucleic acid sample of a specimen. 

The invention also preferably provides a magnetic disk, such as an IBM-compatible 

(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux, 
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed, 
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention 
in a file format suitable for retrieval and processing in a computerized sequence analysis, 

comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing devices 
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN 
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at 
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic 

domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells) 
composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 
generating an electronic signal on an electronic communications device, such as a modem, 
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 

includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for comparing a 
query target to a database containing an array of data structures, such as an assay result 
obtained by the method of the invention, and ranking database targets based on the degree of 

identity and gap weight to the target data. A central processor is preferably initialized to load 
and execute the computer program for alignment and/or comparison of the assay results. 
Data for a query target is entered into the central processor via an I/O device. Execution of 
the computer program results in the central processor retrieving the assay data from the data 
file, which comprises a binary description of an assay result. 

The target data or record and the computer program can be transferred to secondary 

memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or 
SDRAM). Targets are ranked according to the degree of correspondence between a selected 
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of 
the query target and results are output via an I/O device. For example, a central processor 
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC, 
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain 
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a 
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM, 
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can 
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal 
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O 
device. 

The invention also preferably provides the use of a computer system, such as that 
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a 
collection of peptide sequence specificity records obtained by the methods of the invention, 
which may be stored in the computer; (3) a comparison target, such as a query target; and (4) 
a program for alignment and comparison, typically with rank-ordering of comparison results 
on the basis of computed similarity values. 
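
The rank-ordering of comparison results by computed similarity can be sketched as below. This is an illustration only: it uses the `SequenceMatcher` ratio from the Python standard library as a stand-in similarity value, which is not an alignment score of the kind produced by FASTA, GAP, or BESTFIT and does not model gap weights. The function name and the dictionary layout of the database are hypothetical.

```python
from difflib import SequenceMatcher


def rank_targets(query: str, database: dict) -> list:
    """Rank stored sequences by similarity to a query sequence, best match first.

    database maps a record name to its sequence string. The similarity value
    is difflib's ratio in [0, 1], used here only as a simple placeholder for
    a real alignment-based score.
    """
    scored = [
        (name, SequenceMatcher(None, query, sequence, autojunk=False).ratio())
        for name, sequence in database.items()
    ]
    # Sort by descending similarity; ties keep their original database order.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In a fuller system the placeholder score would be replaced by the output of the chosen alignment program, with the ranking step left unchanged.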

Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted 

proteins, transmembrane proteins, or intracellular proteins. In one embodiment, the prostate 
cancer protein is an intracellular protein. Intracellular proteins may be found in the 
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular 
function and replication (including, e.g., signaling pathways); aberrant expression of such 
proteins often results in unregulated or dysregulated cellular processes (see, e.g., Alberts (ed. 

1994) Molecular Biology of the Cell (3d ed.) Garland). For example, many intracellular 

proteins have enzymatic activity such as protein kinase activity, protein phosphatase activity, 
protease activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular 
proteins also serve as docking proteins that are involved in organizing complexes of proteins, 
or targeting proteins to various subcellular localizations, and are involved in maintaining the 

30 structural integrity of organelles. 

An increasingly appreciated concept in characterizing proteins is the presence in the 
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly 
conserved sequences have been identified in proteins that are involved in protein-protein 
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated 
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats, and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and the molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple 
sequence alignments and hidden Markov models covering many common protein domains. 
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res. 26:320-322).

In another embodiment, the prostate cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2
domain-containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels and adenylyl cyclases contain numerous
transmembrane domains. Many important cell surface receptors such as G protein coupled
receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they contain
7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/). Important transmembrane protein receptors
include, but are not limited to, the insulin receptor, insulin-like growth factor receptor, human
growth hormone receptor, glucose transporters, transferrin receptor, epidermal growth factor
receptor, low density lipoprotein receptor, leptin receptor,
and interleukin receptors, e.g., IL-1 receptor, IL-2 receptor, etc.
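
The sequence-based prediction described above can be sketched very simply. The following minimal example scans an amino acid sequence for runs of at least 17 consecutive hydrophobic residues and reports their positions. The residue set `HYDROPHOBIC`, the function name, and the fixed run-length rule are illustrative assumptions; this is not the algorithm used by PSORT, which applies more elaborate hydropathy analysis.

```python
import re

# Assumption: a simple one-letter-code set treated as "hydrophobic" here.
HYDROPHOBIC = "AILMFVWC"


def find_hydrophobic_stretches(sequence: str, min_length: int = 17) -> list:
    """Return (start, end) index pairs of candidate transmembrane segments.

    A candidate is any run of at least `min_length` consecutive residues
    drawn from HYDROPHOBIC. Indices are 0-based, end-exclusive.
    """
    pattern = re.compile(f"[{HYDROPHOBIC}]{{{min_length},}}")
    return [(m.start(), m.end()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence: one 19-residue hydrophobic run, then charged residues.
    demo = "MKR" + "L" * 10 + "A" * 9 + "KKRDE"
    print(find_hydrophobic_stretches(demo))
```

Counting the returned segments gives a crude estimate of the number of membrane-spanning regions; a protein yielding seven segments would be a candidate "seven transmembrane domain" protein under these assumptions.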

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, neurotrophic factors and the like. Extracellular domains also bind to
cell-associated molecules. In this respect, they mediate cell-cell interactions. Cell-associated
ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI) anchor, or
may themselves be transmembrane proteins. Extracellular domains also associate with the
extracellular matrix and contribute to the maintenance of the cell structure.

Prostate cancer proteins that are transmembrane are particularly preferred in the
present invention as they are readily accessible targets for immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins. In addition, some
membrane proteins can be processed to release a soluble protein, or to expose a residual
fragment. Released soluble proteins may be useful diagnostic markers; processed residual
protein fragments may be useful prostate markers of disease.

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins may have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature,
they often serve to transmit signals to various other cell types. The secreted protein may
function in an autocrine manner (acting on the cell that secreted the factor), a paracrine
manner (acting on cells in close proximity to the cell that secreted the factor), an endocrine
manner (acting on cells at a distance, e.g., secretion into the blood stream), or an exocrine
manner (secretion, e.g., through a duct or to an adjacent epithelial surface, as with sweat glands,
sebaceous glands, pancreatic ducts, lacrimal glands, mammary glands, wax producing glands
of the ear, etc.). Thus secreted molecules find use in modulating or altering numerous aspects
of physiology. Prostate cancer proteins that are secreted proteins are particularly preferred in
the present invention as they serve as good targets for diagnostic markers, e.g., for blood,
plasma, serum, or stool tests. Those which are enzymes may be antibody or small molecule
targets. Others may be useful as vaccine targets, e.g., via CTL mechanisms.

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by substantial
nucleic acid and/or amino acid sequence homology or linkage to the prostate cancer
sequences outlined herein. Such homology can be based upon the overall nucleic acid or
amino acid sequence, and is generally determined as outlined below, using either homology
programs or hybridization conditions. Typically, linked sequences on a mRNA are found on
the same molecule.

The prostate cancer nucleic acid sequences of the invention, e.g., the sequences in
Tables 1A-4, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes"
in this context includes coding regions, non-coding regions, and mixtures of coding and
non-coding regions. Accordingly, as will be appreciated by those in the art, using the sequences
provided herein, extended sequences, in either direction, of the prostate cancer genes can be
obtained, using techniques well known in the art for cloning either longer sequences or the
full length sequences; see Ausubel, et al., supra. Much can be done by informatics and many
sequences can be clustered to include multiple sequences corresponding to a single gene, e.g.,
using systems such as UniGene (see, http://www.ncbi.nlm.nih.gov/UniGene/).

Once the prostate cancer nucleic acid is identified, it can be cloned and, if necessary, 
its constituent parts recombined to form the entire prostate cancer nucleic acid coding regions 
or the entire mRNA sequence. Once isolated from its natural source, e.g., contained within a 
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the
recombinant prostate cancer nucleic acid can be further used as a probe to identify and isolate
other prostate cancer nucleic acids, e.g., extended coding regions. It can also be used as a 
"precursor" nucleic acid to make modified or variant prostate cancer nucleic acids and 
proteins. 

The prostate cancer nucleic acids of the present invention are used in several ways. In
a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are made and
attached to biochips to be used in screening and diagnostic methods, as outlined below, or for
administration, e.g., for gene therapy, vaccine, and/or antisense applications. Alternatively,
the prostate cancer nucleic acids that include coding regions of prostate cancer proteins can
be put into expression vectors for the expression of prostate cancer proteins, again for
screening purposes or for administration to a patient.

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic acids (both
the nucleic acid sequences outlined in the figures and/or the complements thereof) are made.
The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be base pair mismatches
which will interfere with hybridization between the target sequence and the single stranded
nucleic acids of the present invention. However, if the number of mutations is so great that
no hybridization can occur under even the least stringent of hybridization conditions, the
sequence is not a complementary target sequence. Thus, by "substantially complementary"
herein is meant that the probes are sufficiently complementary to the target sequences to
hybridize under normal reaction conditions, particularly high stringency conditions, as
outlined herein.
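
To make the notion of imperfect complementarity concrete, the following minimal sketch counts base pair mismatches between a probe and a target site of equal length. The function names and the `max_mismatches` threshold are hypothetical; a fixed mismatch count is only a crude stand-in for hybridization under defined stringency conditions, which in practice depends on temperature, salt concentration, and sequence composition.

```python
# Watson-Crick complement table for uppercase DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions where the probe fails to pair with the target site.

    Both sequences are written 5'->3' and must have equal length.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must be the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for a, b in zip(probe.upper(), perfect_probe) if a != b)


def is_substantially_complementary(probe: str, target_site: str,
                                   max_mismatches: int = 2) -> bool:
    """Assumption: tolerate up to `max_mismatches` as a proxy for stringency."""
    return count_mismatches(probe, target_site) <= max_mismatches


if __name__ == "__main__":
    # Hypothetical target site and a probe carrying one mismatch.
    print(count_mismatches("TAAGGACAT", "ATGGCCTTA"))
```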

A nucleic acid probe is generally single stranded but can be partially single and
partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally whole
genes are not used. In some embodiments, much longer nucleic acids can be used, up to
hundreds of bases.
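
As an illustration of probes in the stated length range, the following minimal sketch tiles a target sequence with fixed-length windows and emits the reverse complement of each window as a candidate probe. The function names, the default length of 30 bases, and the step size are assumptions chosen for illustration; windows share sequence whenever the step is smaller than the probe length.

```python
# Watson-Crick complement table for uppercase DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tile_probes(target: str, length: int = 30, step: int = 10) -> list:
    """Return (start, probe) pairs covering the target with fixed-length windows.

    Each probe is the reverse complement of target[start:start + length].
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    probes = []
    for start in range(0, len(target) - length + 1, step):
        window = target[start:start + length]
        probes.append((start, reverse_complement(window)))
    return probes


if __name__ == "__main__":
    # Hypothetical 50-base target; not a sequence from the tables.
    target = "ACGT" * 12 + "AC"
    for start, probe in tile_probes(target):
        print(start, probe)
```

With a 50-base target, 30-base probes, and a step of 10, this yields three probes starting at positions 0, 10, and 20.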

In a preferred embodiment, more than one probe per sequence is used, with either
overlapping probes or probes to different sections of the target being used. That is, two,
three, four or more probes, with three being preferred, are used to build in a redundancy for a
particular target. The probes can be overlapping (i.e., have some sequence in common), or
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant that the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic,
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds.
Covalent bonds can be formed directly between the probe and the solid support or can be
formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to the biochip in a wide variety of ways, as will be 
appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or
other grammatical equivalents herein is meant a material that can be modified to contain
discrete individual sites appropriate for the attachment or association of the nucleic acid
probes and is amenable to at least one detection method. As will be appreciated by those in
the art, the number of possible substrates is very large; possible substrates include, but are not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or
silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses,
plastics, etc. In general, the substrates allow optical detection and do not appreciably
fluoresce. A preferred substrate is described in WO0055627, herein incorporated by
reference in its entirety.

Generally the substrate is planar, although as will be appreciated by those in the art,
other configurations of substrates may be used as well. For example, the probes may be
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample 
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed 
cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized as is known in the art, and then 
attached to the surface of the solid support. As will be appreciated by those skilled in the art,
either the 5' or 3' terminus may be attached to the solid support, or attachment may be via an
internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very strong,
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to
surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in 
the art. For example, photoactivation techniques utilizing photopolymerization compounds 
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in 
situ, using well known photolithographic techniques, such as those described in WO
95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited
within, all of which are expressly incorporated by reference; these methods of attachment
form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression level of
prostate cancer-associated sequences. These assays are typically performed in conjunction
with reverse transcription. In such assays, a prostate cancer-associated nucleic acid sequence
acts as a template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In
a quantitative amplification, the amount of amplification product will be proportional to the
amount of template in the original sample. Comparison to appropriate controls provides a
measure of the amount of prostate cancer-associated RNA. Methods of quantitative
amplification are well known to those of skill in the art. Detailed protocols for quantitative
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and
Applications Academic Press.
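
One way the comparison to controls can be expressed numerically is sketched below. It assumes the widely used comparative threshold cycle (Ct) calculation, which is not described in this specification: each sample's target Ct is normalized to a reference transcript, and the fold difference relative to a control sample is computed on the assumption that product doubles every cycle. The function name, parameter names, and the default `efficiency` of 2.0 are illustrative assumptions.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float,
                        efficiency: float = 2.0) -> float:
    """Return the fold change of a target in a sample relative to a control.

    Assumption: amplification efficiency is identical for target and
    reference (default: perfect doubling per cycle).
    """
    # Normalize each target Ct to the reference transcript in the same sample.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # A lower Ct means more starting template, hence the negated exponent.
    return efficiency ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles earlier
    # in the test sample than in the control, after normalization.
    print(relative_expression(24, 18, 27, 18))
```

Under these assumptions, a three-cycle difference corresponds to an eight-fold difference in starting template.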

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase chain
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560-569, Landegren, et al. (1988)
Science 241:1077-1080, and Barringer, et al. (1990) Gene 89:117-122), transcription
amplification (Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), self-sustained
sequence replication (Guatelli, et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), dot
PCR, and linker adapter PCR, etc.


Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., encoding prostate
cancer proteins, are used to make a variety of expression vectors to express prostate cancer
proteins which can then be used in screening assays, as described below. Expression vectors
and recombinant DNA technology are well known to those of skill in the art (see, e.g.,
Ausubel, supra, and Fernandez and Hoeffler (eds. 1999) Gene Expression Systems Academic
Press) and are used to express proteins. The expression vectors may be either self-replicating
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these
expression vectors include transcriptional and translational regulatory nucleic acid operably
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences"
refers to DNA sequences used for the expression of an operably linked coding sequence in a
particular host organism. Control sequences that are suitable for prokaryotes, e.g., include a
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are
known to utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; a ribosome binding site is operably
linked to a coding sequence if it is positioned so as to facilitate translation, and sequences
may be operably linked when they are physically linked on the same molecule. Generally,
"operably linked" means that the DNA sequences being linked are contiguous, and, in the
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have
to be contiguous. Linking is typically accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. Transcriptional and translational regulatory nucleic
acid will generally be appropriate to the host cell used to express the prostate cancer protein.
Numerous types of appropriate expression vectors, and suitable regulatory sequences are
known in the art for a variety of host cells.

In general, transcriptional and translational regulatory sequences may include, but are 
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop 
sequences, translational start and stop sequences, and enhancer or activator sequences. In a
preferred embodiment, the regulatory sequences include a promoter and transcriptional start 
and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The promoters 
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which 
combine elements of more than one promoter, are also known in the art, and are useful in the
present invention. 

In addition, an expression vector may comprise additional elements. For example, the
expression vector may have two replication systems, thus allowing it to be maintained in two
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for
cloning and amplification. Furthermore, for integrating expression vectors, the expression
vector contains at least one sequence homologous to the host cell genome, and preferably two
homologous sequences which flank the expression construct. The integrating vector may be
directed to a specific locus in the host cell by selecting the appropriate homologous sequence
for inclusion in the vector. Constructs for integrating vectors are well known in the art (e.g.,
Fernandez and Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a selectable 
marker gene to allow the selection of transformed host cells. Selection genes are well known 
in the art and will vary with the host cell used. 

The prostate cancer proteins of the present invention are produced by culturing a host
cell transformed with an expression vector containing nucleic acid encoding a prostate cancer
protein, under the appropriate conditions to induce or cause expression of the prostate cancer
protein. Conditions appropriate for prostate cancer protein expression will vary with the
choice of the expression vector and the host cell, and will be easily ascertained by one skilled
in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and 
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae 
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a 
macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the prostate cancer proteins are expressed in mammalian
cells. Mammalian expression systems are also known in the art, and include retroviral and
adenoviral systems. One expression vector system is a retroviral vector system such as is
generally described in PCT/US97/01019 and PCT/US97/01048, both of which are hereby
expressly incorporated by reference. Of particular use as mammalian promoters are the
promoters from mammalian viral genes, since the viral genes are often highly expressed and
have a broad host range. Examples include the SV40 early promoter, mouse mammary tumor
virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter, and the
CMV promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription
termination and polyadenylation sequences recognized by mammalian cells are regulatory
regions located 3' to the translation stop codon and thus, together with the promoter elements,
flank the coding sequence. Examples of transcription terminator and polyadenylation signals
include those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as
other hosts, are well known in the art, and will vary with the host cell used. Techniques
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

In a preferred embodiment, prostate cancer proteins are expressed in bacterial
systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the prostate cancer protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of
bacterial strains that have been transformed. Suitable selection genes include genes which
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin,
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptococcus lividans, among others (e.g., Fernandez and Hoeffler, supra). The
bacterial expression vectors are transformed into bacterial host cells using techniques well
known in the art, such as calcium chloride treatment, electroporation, and others.

In one embodiment, prostate cancer proteins are produced in insect cells. Expression
vectors for the transformation of insect cells, and in particular, baculovirus-based expression
vectors, are well known in the art.

In a preferred embodiment, prostate cancer protein is produced in yeast cells. Yeast
expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The prostate cancer protein may also be made as a fusion protein, using techniques
well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired
epitope is small, the prostate cancer protein may be fused to a carrier protein to form an
immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to
increase expression, or for other reasons. For example, when the prostate cancer protein is a
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic
acid for expression purposes.

In a preferred embodiment, the prostate cancer protein is purified or isolated after
expression. Prostate cancer proteins may be isolated or purified in a variety of ways known
to those skilled in the art depending on what other components are present in the sample.
Standard purification methods include electrophoretic, molecular, immunological and
chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the prostate cancer
protein may be purified using a standard anti-prostate cancer protein antibody column.
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also
useful. For general guidance in suitable purification techniques, see Scopes (1982) Protein
Purification Springer-Verlag. The degree of purification necessary will vary depending on
the use of the prostate cancer protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the prostate cancer proteins and nucleic
acids are useful in a number of applications. They may be used as immunoselection reagents,
as vaccine reagents, as screening agents, etc.

Variants of prostate cancer proteins 

In one embodiment, the prostate cancer proteins are derivative or variant prostate
cancer proteins as compared to the wild-type sequence. That is, as outlined more fully below,
the derivative prostate cancer peptide will often contain at least one amino acid substitution, 
deletion or insertion, with amino acid substitutions being particularly preferred. The amino 
acid substitution, insertion, or deletion may occur at almost any residue within the prostate
cancer peptide. 

Also included within one embodiment of prostate cancer proteins of the present
invention are amino acid sequence variants. These variants typically fall into one or more of
three classes: substitutional, insertional, or deletional variants. These variants ordinarily are 
prepared by site specific mutagenesis of nucleotides in the DNA encoding the prostate cancer 
protein, using cassette or PCR mutagenesis or other techniques well known in the art, to
produce DNA encoding the variant, and thereafter expressing the DNA in recombinant cell
culture as outlined above. However, variant prostate cancer protein fragments having up to 
about 100-150 residues may be prepared by in vitro synthesis using established techniques. 
Amino acid sequence variants are characterized by the predetermined nature of the variation, 
a feature that sets them apart from naturally occurring allelic or interspecies variation of the
prostate cancer protein amino acid sequence. The variants typically exhibit the same
qualitative biological activity as the naturally occurring analogue, although variants can also
be selected which have modified characteristics as will be more fully outlined below.

While the site or region for introducing an amino acid sequence variation is
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 
conducted at the target codon or region and the expressed prostate cancer variants screened
for the optimal combination of desired activity. Techniques for making substitution
mutations at predetermined sites in DNA having a known sequence are well known, e.g.,
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using
assays of prostate cancer protein activities. 
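
By way of illustration only, the idea of a predetermined site with an undetermined mutation can be sketched at the protein level as an enumeration of every single-residue substitution at one position, each of which would then be screened. The function name and the six-residue example sequence are hypothetical and are not taken from the specification.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def substitutions_at(sequence, position):
    """Return all single-residue substitution variants at a 1-based position.

    Keys use the conventional <wild-type><position><replacement> naming.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    wild_type = sequence[position - 1]
    variants = {}
    for residue in AMINO_ACIDS:
        if residue == wild_type:
            continue  # skip the unchanged residue
        name = f"{wild_type}{position}{residue}"
        variants[name] = sequence[:position - 1] + residue + sequence[position:]
    return variants


# Hypothetical peptide; position 3 (threonine) is the predetermined site.
variants = substitutions_at("MKTLSA", 3)
print(len(variants), variants["T3A"])
```

Each entry of the returned mapping corresponds to one candidate variant that would be expressed and screened in an activity assay.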

Amino acid substitutions are typically of single residues; insertions usually will be on
the order of from about 1 to 20 amino acids, although considerably larger insertions may be
tolerated. Deletions range from about 1 to about 20 residues, although in some cases
deletions may be much larger.

Substitutions, deletions, insertions or a combination thereof may be used to arrive at a 
final derivative. Generally these changes are done on a few amino acids to minimize the
alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the prostate cancer protein are
desired, substitutions are generally made in accordance with the amino acid substitution 
relationships provided in the definition section. 

The variants typically exhibit the same qualitative biological activity and will elicit
the same immune response as the naturally-occurring analog, although variants also are
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively,
the variant may be designed such that the biological activity of the prostate cancer protein is 
altered. For example, glycosylation sites may be altered or removed. 

Substantial changes in function or immunological identity are made by selecting
substitutions that are less conservative than those described above. For example,
substitutions may be made which more significantly affect: the structure of the polypeptide
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure; 
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain. 
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g., serinyl or threonyl,
is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) another residue; (c) a residue having
an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, 
e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine. 
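
As a minimal sketch, classes (a) through (d) above can be expressed as a lookup over one-letter residue codes. The residue sets contain only the examples named in the text (the text introduces them with "e.g.", so the sets are not exhaustive), and the function and constant names are hypothetical.

```python
# Residue groups limited to the examples given in the text (one-letter codes).
HYDROPHILIC = set("ST")      # serinyl, threonyl
HYDROPHOBIC = set("LIFVA")   # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = set("KRH") # lysyl, arginyl, histidyl
ELECTRONEGATIVE = set("ED")  # glutamyl, aspartyl
BULKY = set("F")             # phenylalanine
NO_SIDE_CHAIN = set("G")     # glycine


def _between(a, b, group1, group2):
    """True if one residue is in group1 and the other in group2, either way round."""
    return (a in group1 and b in group2) or (a in group2 and b in group1)


def disruptive_classes(original, replacement):
    """Return the list of classes (a)-(d) that a substitution falls into."""
    if original == replacement:
        return []
    classes = []
    if _between(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        classes.append("a")
    if "C" in (original, replacement) or "P" in (original, replacement):
        classes.append("b")
    if _between(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        classes.append("c")
    if _between(original, replacement, BULKY, NO_SIDE_CHAIN):
        classes.append("d")
    return classes


print(disruptive_classes("K", "E"))
print(disruptive_classes("L", "I"))
```

An empty list means only that the substitution matches none of the named examples; it does not establish that the substitution is conservative.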

Covalent modifications of prostate cancer polypeptides are included within the scope
of this invention. One type of covalent modification includes reacting targeted amino acid
residues of a prostate cancer polypeptide with an organic derivatizing agent that is capable of
reacting with selected side chains or the N- or C-terminal residues of a prostate cancer
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking
prostate cancer polypeptides to a water-insoluble support matrix or surface for use in the
method for purifying anti-prostate cancer polypeptide antibodies or screening assays, as is
more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters
with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters
such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl residues to 
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and 
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues, 
methylation of the amino groups of the lysine, arginine, and histidine side chains (e.g., pp.
79-86, Creighton (1983) Proteins: Structure and Molecular Properties Freeman), acetylation
of the N-terminal amine, and amidation of a C-terminal carboxyl group. 

Another type of covalent modification of the prostate cancer polypeptide included 
within the scope of this invention comprises altering the native glycosylation pattern of the 
polypeptide. "Altering the native glycosylation pattern" is intended for purposes herein to
mean deleting one or more carbohydrate moieties found in native sequence prostate cancer
polypeptide, and/or adding one or more glycosylation sites that are not present in the native 
sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many ways. 
For example the use of different cell types to express prostate cancer-associated sequences 
can result in different glycosylation patterns. 

Addition of glycosylation sites to prostate cancer polypeptides may also be
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g.,
by the addition of, or substitution by, one or more serine or threonine residues to the native
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer
amino acid sequence may optionally be altered through changes at the DNA level, 
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected 
bases such that codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the prostate
cancer polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide.
Such methods are described in the art, e.g., in WO 87/05330, and pp. 259-306 in Aplin and 
Wriston (1981) CRC Crit. Rev. Biochem. 

Removal of carbohydrate moieties present on the prostate cancer polypeptide may be
accomplished chemically or enzymatically or by mutational substitution of codons encoding
amino acid residues that serve as targets for glycosylation. Chemical deglycosylation
techniques are known in the art and described, e.g., by Hakimuddin, et al. (1987) Arch.
Biochem. Biophys. 259:52-57; and Edge, et al. (1981) Anal. Biochem. 118:131-137.
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth.
Enzymol. 138:350-359.

Another type of covalent modification of prostate cancer polypeptides comprises linking the
prostate cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192; or 4,179,337.

Prostate cancer polypeptides of the present invention may also be modified in a way 
to form chimeric molecules comprising a prostate cancer polypeptide fused to another, 
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric 
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using 
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the 
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of 
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule. 

Various tag polypeptides and their respective antibodies are well known in the art. 
Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6
and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field, et al.
(1988) Mol. Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7, and
9E10 antibodies thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616);
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al.
(1990) Protein Engineering 3:547-553). Other tag polypeptides include the Flag-peptide
(Hopp, et al. (1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al.
(1992) Science 255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem. 
266:15163-15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990) 
Proc. Natl. Acad. Sci. USA 87:6393-6397). 
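
For illustration, placing an epitope tag at the amino- or carboxyl-terminus amounts to concatenating the tag sequence with the polypeptide sequence. In the sketch below the tag sequences are the commonly cited one-letter sequences for these tags and are not recited in the specification; the function name and example polypeptide are hypothetical, and details such as linkers or the initiator methionine are ignored.

```python
# Commonly cited tag sequences (an assumption; not taken from the specification).
TAG_SEQUENCES = {
    "His6": "HHHHHH",
    "HA": "YPYDVPDYA",
    "c-myc": "EQKLISEEDL",
    "FLAG": "DYKDDDDK",
}


def add_epitope_tag(polypeptide, tag, terminus="C"):
    """Return the polypeptide sequence fused to a tag at the N- or C-terminus."""
    tag_sequence = TAG_SEQUENCES[tag]
    if terminus == "N":
        return tag_sequence + polypeptide
    if terminus == "C":
        return polypeptide + tag_sequence
    raise ValueError("terminus must be 'N' or 'C'")


# Hypothetical three-residue polypeptide used only to show the concatenation.
print(add_epitope_tag("MKT", "His6", terminus="C"))
print(add_epitope_tag("MKT", "FLAG", terminus="N"))
```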

Also included are other prostate cancer proteins of the prostate cancer family, and
prostate cancer proteins from other organisms, which are cloned and expressed as outlined
below. Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be
used to find other related prostate cancer proteins from humans or other organisms. As will 
be appreciated by those in the art, particularly useful probe and/or PCR primer sequences 
include the unique areas of the prostate cancer nucleic acid sequence. As is generally known
in the art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with
from about 20 to about 30 being preferred, and may contain inosine as needed. The
conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols, supra).
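
The primer length ranges stated above can be checked with a short sketch such as the following. The hard cutoffs are an assumption, since the text qualifies each bound with "about"; the function name and the example primers are hypothetical. Inosine is written as "I".

```python
ALLOWED_BASES = set("ACGTI")  # I = inosine, which the text permits as needed


def primer_length_class(primer):
    """Classify a primer by the length ranges given in the text."""
    sequence = primer.upper()
    if not sequence or set(sequence) - ALLOWED_BASES:
        raise ValueError("primer must contain only A, C, G, T or I")
    length = len(sequence)
    if 20 <= length <= 30:
        return "20-30 nt (narrower range)"
    if 15 <= length <= 35:
        return "15-35 nt (broader range)"
    return "outside stated ranges"


# Hypothetical primers used only to exercise the three outcomes.
print(primer_length_class("ACGT" * 6))          # 24 nt
print(primer_length_class("ACGTI" * 3 + "AC"))  # 17 nt, contains inosine
print(primer_length_class("ACGT"))              # 4 nt
```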

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein should
share at least one epitope or determinant with the full length protein. By "epitope" or 
"determinant" herein is typically meant a portion of a protein which will generate and/or bind 
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller prostate cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are known to the skilled artisan (e.g.,
Coligan, supra; and Harlow and Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. 
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple 
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose 
dicorynomycolate). The immunization protocol may be selected by one skilled in the art 
without undue experimentation. 

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host 
animal, is typically immunized with an immunizing agent to elicit lymphocytes that produce 
or are capable of producing antibodies that will specifically bind to the immunizing agent. 
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will
typically include a polypeptide encoded by a nucleic acid of Tables 1A-4 or fragment thereof,
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are 
used if cells of human origin are desired, or spleen cells or lymph node cells are used if non- 
human mammalian sources are desired. The lymphocytes are then fused with an 
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (see pp. 59-103 in Goding (1986) Monoclonal Antibodies: Principles and
Practice Academic Press). Immortalized cell lines are usually transformed mammalian cells,
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse 
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture 
medium that preferably contains one or more substances that inhibit the growth or survival of
the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are 
monoclonal, preferably human or humanized, antibodies that have binding specificities for at 
least two different antigens or that have binding specificities for two epitopes on the same
antigen. In one embodiment, one of the binding specificities is for a protein encoded by a 
nucleic acid of Tables 1A-4 or a fragment thereof, the other one is for another antigen, and 
preferably for a cell-surface protein or receptor or receptor subunit, preferably one that is 
tumor specific. Alternatively, tetramer-type technology may create multivalent reagents.

In a preferred embodiment, the antibodies to prostate cancer protein are capable of
reducing or eliminating a biological function of a prostate cancer protein, as is described
below. That is, the addition of anti-prostate cancer protein antibodies (either polyclonal or 
preferably monoclonal) to prostate cancer tissue (or cells containing prostate cancer) may 
reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in activity, 
growth, size or the like is preferred, with at least about 50% being particularly preferred and
about a 95-100% decrease being especially preferred. 

In a preferred embodiment the antibodies to the prostate cancer proteins are 
humanized antibodies (e.g., Xenerex Biosciences; Medarex, Inc.; Abgenix, Inc.; Protein 
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric 
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain 
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include 
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
may also comprise residues which are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, a humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 
332:323-329; and Presta (1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be
essentially performed following methods of Winter and co-workers (see, e.g., Jones, et al.
(1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-327; and Verhoeyen, et
al. (1988) Science 239:1534-1536), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species.

Human antibodies can also be produced using various techniques known in the art, 
including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381-388;
Marks, et al. (1991) J. Mol. Biol. 222:581-597) or the preparation of human monoclonal
antibodies (e.g., p77 in Cole, et al. (1985) Monoclonal Antibodies and Cancer Therapy Liss;
and Boerner, et al. (1991) J. Immunol. 147(1):86-95). Similarly, human antibodies can be
made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in
which the endogenous immunoglobulin genes have been partially or completely inactivated.
Upon challenge, human antibody production is observed, which closely resembles that seen
in humans in most respects, including gene rearrangement, assembly, and antibody repertoire.
This approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825;
5,625,126; 5,633,425; 5,661,016, and in the following scientific publications: Marks, et al.
(1992) Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison
(1994) Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51;
Neuberger (1996) Nature Biotechnology 14:826; Lonberg and Huszar (1995) Intern. Rev.
Immunol. 13:65-93.

By immunotherapy is meant treatment of prostate cancer with an antibody raised 
against prostate cancer proteins. As used herein, immunotherapy can be passive or active. 
Passive immunotherapy as defined herein is the passive transfer of antibody to a recipient 
(patient). Active immunization is the induction of antibody and/or T-cell responses in a
recipient (patient). Induction of an immune response is the result of providing the recipient
with an antigen to which antibodies are raised. As appreciated by one of ordinary skill in the
art, the antigen may be provided by injecting a polypeptide against which antibodies are
desired to be raised into a recipient, or contacting the recipient with a nucleic acid capable of 
expressing the antigen and under conditions for expression of the antigen, leading to an 
immune response. 

In a preferred embodiment the prostate cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment bind and prevent the secreted protein from binding to its receptor, thereby
inactivating the secreted prostate cancer protein. 

In another preferred embodiment, the prostate cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment bind the extracellular domain of the prostate cancer protein and prevent it from 
binding to other proteins, such as circulating ligands or cell-associated molecules. The 
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will 
be appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
prostate cancer protein. The antibody is also often an antagonist of the prostate cancer 
protein. Further, the antibody may prevent activation of the transmembrane prostate cancer 
protein. In one aspect, when the antibody prevents the binding of other molecules to the 
prostate cancer protein, the antibody prevents growth of the cell. The antibody may also be
used to target or sensitize the cell to cytotoxic agents, including, but not limited to TNF-α,
TNF-β, IL-1, IFN-γ, and IL-2, or chemotherapeutic agents including 5FU, vinblastine,
actinomycin D, cisplatin, methotrexate, and the like. In some instances the antibody belongs
to a sub-type that activates serum complement when complexed with the transmembrane
protein thereby mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC).
Thus, prostate cancer is treated by administering to a patient antibodies directed against the
transmembrane prostate cancer protein. Antibody-labeling may activate a co-toxin, localize a
toxin payload, or otherwise provide means to locally ablate cells. 

In another preferred embodiment, the antibody is conjugated to an effector moiety. 
The effector moiety can be a labeling moiety such as a radioactive label or fluorescent label,
or can be a therapeutic moiety. In one aspect the therapeutic moiety is a small molecule that
modulates the activity of the prostate cancer protein. In another aspect the therapeutic moiety
modulates the activity of molecules associated with or in close proximity to the prostate
cancer protein. The therapeutic moiety may inhibit enzymatic activity such as protease or
collagenase or protein kinase activity associated with prostate cancer. 

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to prostate cancer tissue or cells results in a
reduction in the number of afflicted cells, thereby reducing symptoms associated with
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their 
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A 
chain, curcin, crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic
agents also include radiochemicals made by conjugating radioisotopes to antibodies raised
against prostate cancer proteins, or binding of a radionuclide to a chelating agent that has 
been covalently attached to the antibody. Targeting the therapeutic moiety to transmembrane 
prostate cancer proteins not only serves to increase the local concentration of therapeutic 
moiety in the prostate cancer afflicted area, but also serves to reduce deleterious side effects,
e.g., by binding to normal tissues, that may be associated with the therapeutic moiety.

In another preferred embodiment, the prostate cancer protein against which the 
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated 
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by 
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the prostate cancer protein can be targeted within a
cell, i.e., the nucleus, an antibody thereto contains a signal for that target localization, i.e., a 
nuclear localization signal. 

The prostate cancer antibodies of the invention specifically bind to prostate cancer 
proteins. By "specifically bind" herein is meant that the antibodies bind to the protein with a
Kd of at least about 0.1 mM, more usually at least about 1 μM, preferably at least about 0.1
μM or better, and most preferably, 0.01 μM or better. Selectivity of binding is also
important. 
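
The affinity tiers in this definition can be illustrated with a short sketch. It assumes the four thresholds are 0.1 mM, 1 μM, 0.1 μM and 0.01 μM, and it reads "at least about" as a dissociation constant at or below the threshold, since a lower Kd means tighter binding. The tier labels, the function name, and the sharp cutoffs are hypothetical simplifications.

```python
# Thresholds in mol/L, tightest first; labels are illustrative only.
KD_TIERS = [
    (1e-8, "most preferred (0.01 uM or better)"),
    (1e-7, "preferred (0.1 uM or better)"),
    (1e-6, "more usual (1 uM or better)"),
    (1e-4, "specific binding (0.1 mM or better)"),
]


def binding_tier(kd_molar):
    """Return the tightest tier whose threshold the measured Kd satisfies."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    for threshold, label in KD_TIERS:
        if kd_molar <= threshold:
            return label
    return "not specific under this definition"


# Hypothetical measured values: 5 nM, 0.5 uM, and 1 mM.
for kd in (5e-9, 5e-7, 1e-3):
    print(kd, binding_tier(kd))
```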

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. After androgen ablation therapy, cells that
survive the therapy undergo a period of quiescence followed at some time later by active cell
division. As explained above, there are a variety of expression patterns characteristic of the
prostate cancer genes involved in androgen-independent prostate cancer. Some genes are
expressed early in the time course following ablation therapy, then drop off in expression,
and then express again with emergence of androgen-independence (hi-lo-hi pattern in Table 1A).
Other genes are expressed early in the time course following ablation therapy, then drop off
in expression, and do not express again with emergence of androgen-independence (hi-lo-lo
pattern in Table 1A). Still other genes are not expressed early in the time course, but express
only with emergence of androgen-independence (lo-lo-hi pattern in Table 1A). Other genes
are not expressed early in the time course, but then express as androgen is withdrawn and
continue to express with emergence of androgen-independence (lo-hi-hi pattern in Table 1A).
Finally, some genes are not expressed early in the time course, but then express as androgen
is withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo pattern
in Table 1A). Thus, the data suggest that different antigens are expressed in quiescent cells
and actively dividing androgen-independent prostate cancer cells. 
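
The pattern labels above can be derived from a time course of expression values by comparing each time point with a cutoff, as in the following minimal sketch. The single cutoff, the expression values, and the function names are hypothetical; the specification does not state how "hi" and "lo" were assigned. The function accepts any number of time points.

```python
# Short paraphrases of the three-time-point patterns described in the text.
PATTERN_MEANING = {
    "hi-lo-hi": "expressed early, drops off, returns with androgen independence",
    "hi-lo-lo": "expressed early, drops off, does not return",
    "lo-lo-hi": "expressed only with androgen independence",
    "lo-hi-hi": "induced on androgen withdrawal, stays on",
    "lo-hi-lo": "induced on androgen withdrawal, drops off again",
}


def expression_pattern(levels, threshold):
    """Label each time point 'hi' or 'lo' against one cutoff and join the labels."""
    return "-".join("hi" if level >= threshold else "lo" for level in levels)


def describe(levels, threshold):
    """Return the pattern string and its paraphrased meaning, if one is listed."""
    pattern = expression_pattern(levels, threshold)
    return pattern, PATTERN_MEANING.get(pattern, "pattern not named in the text")


# Hypothetical expression values at three time points after ablation therapy.
print(describe([850.0, 40.0, 600.0], threshold=200.0))
```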

In another aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. After androgen ablation therapy, cells that
survive the therapy undergo a period of quiescence followed at some time later by active cell
division. As explained above, there are a variety of expression patterns characteristic of the
prostate cancer genes involved in androgen-independent prostate cancer. Some genes are
expressed early in the time course following ablation therapy, then drop off in expression,
and then express again with emergence of androgen-independence (hi-lo-lo-hi pattern in
Table 2A). Other genes are expressed early in the time course following ablation therapy,
then drop off in expression, and do not express again with emergence of
androgen-independence (hi-lo-lo-lo and hi-hi-lo-lo pattern in Table 2A). Still other genes are not
expressed early in the time course, but express only with emergence of
androgen-independence (lo-lo-lo-hi pattern in Table 2A). Other genes are not expressed early in the
time course, but then express as androgen is withdrawn and continue to express with
emergence of androgen-independence (lo-lo-hi-hi pattern in Table 2A). Finally, some genes
are not expressed early in the time course, but then express as androgen is withdrawn and
drop off again with emergence of androgen-independence (lo-lo-hi-lo pattern in Table 2A).
Thus, the data suggest that different antigens are expressed in quiescent cells (during
androgen withdrawal) and actively dividing androgen-independent prostate cancer cells.

Effective therapy to combat androgen-independent prostate cancer requires that the
timing of therapy coincide with expression of the target genes. Patients can be monitored for
the expression of certain diagnostic antigens that indicate the presence of quiescent cells or
which indicate the transition to actively dividing androgen-independent prostate cancer cells.
Thus, therapy to combat androgen-independent prostate cancer should begin at some time
following androgen ablation therapy, depending on the particular target. Typically the
transition from quiescence to actively dividing androgen-independent prostate cancer occurs 
between 6-24 months following androgen ablation therapy. Thus, preferred time periods for 
the therapies of the invention are as follows: 

Expression levels of genes in normal tissue (i.e., not undergoing prostate cancer) and
in prostate cancer tissue (and in some cases, for varying severities of prostate cancer that
relate to prognosis, as outlined below) or in non-malignant disease are evaluated to provide 
expression profiles. An expression profile of a particular cell state or point of development is 
essentially a "fingerprint" of the state. While two states may have a particular gene similarly
expressed, the evaluation of a number of genes simultaneously allows the generation of a
gene expression profile that is reflective of the state of the cell. By comparing expression 
profiles of cells in different states, information regarding which genes are important 
(including both up- and down-regulation of genes) in each of these states is obtained. Then, 
diagnosis may be performed or confirmed to determine whether a tissue sample has the gene
expression profile of normal or cancerous tissue. This will provide for molecular diagnosis
of related conditions. 

"Differential expression/' or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression patterns 
within and among cells and tissue. Thus, a differentially expressed gene can qualitatively 

25 have its expression altered, including an activation or inactivation, in, e.g., normal versus 
prostate cancer tissue. Genes may be turned on or turned off in a particular state, relative to 
another state thus permitting comparison of two or more states. A qualitatively regulated 
gene will exhibit an expression pattern within a state or cell type which is detectable by 
standard techniques. Some genes will be expressed in one state or cell type, but not in both. 

30 Alternatively, the difference in expression may be quantitative, e.g., in that expression is 
increased or decreased; i.e., gene expression is either upregulated, resulting in an increased 
amount of transcript, or downregulated, resulting in a decreased amount of transcript. The 
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degree to which expression differs need only be large enough to quantify via standard 
characterization techniques as outlined below, such as by use of Affymetrix GeneChip™ 
expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby expressly 
incorporated by reference. Other techniques include, but are not limited to, quantitative 
5 reverse transcriptase PCR, northern analysis and RNase protection. As outlined above, 
preferably the change in expression (i.e., upregulation or downregulation) is at least about 
50%, more preferably at least about 100%, more preferably at least about 150%, more 
preferably at least about 200%, with from 300 to at least 1000% being especially preferred. 
Evaluation may be at the gene transcript, or the protein level. The amount of gene 

1 0 expression may be monitored using nucleic acid probes to the DNA or RNA equivalent of the 
gene transcript, and the quantification of gene expression levels, or, alternatively, the final 
gene product itself (protein) can be monitored, e.g., with antibodies to the prostate cancer 
protein and standard immunoassays (ELIS As, etc.) or other techniques, including mass 
spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to prostate 

15 cancer genes, i.e., those identified as being important in a prostate cancer or disease 
phenotype, can be evaluated in a prostate cancer diagnostic test. 
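
By way of illustration only, the preferred percentage thresholds for a change in expression recited above can be expressed as a short calculation. The following sketch is not part of the disclosed methods. It assumes that expression in each tissue is summarized as a single positive value and that a change is measured symmetrically, so that a two-fold increase and a two-fold decrease both count as a 100% change; the function names and tier labels are hypothetical.

```python
# Illustrative sketch only. The symmetric-change convention and all names
# below are assumptions made for this example.

# Preferred thresholds recited in the text, in percent, highest first.
TIERS = [
    (300.0, "300% or more (especially preferred)"),
    (200.0, "at least about 200%"),
    (150.0, "at least about 150%"),
    (100.0, "at least about 100%"),
    (50.0, "at least about 50%"),
]


def percent_change(normal: float, cancer: float) -> float:
    """Signed percent change; positive means upregulated in cancer tissue."""
    if normal <= 0 or cancer <= 0:
        raise ValueError("expression values must be positive")
    if cancer >= normal:
        return (cancer / normal - 1.0) * 100.0
    # Downregulation is measured against the lower value so that it is
    # symmetric with upregulation (an assumption of this sketch).
    return -(normal / cancer - 1.0) * 100.0


def preference_tier(change: float):
    """Return the highest tier met by the magnitude of change, or None."""
    magnitude = abs(change)
    for threshold, label in TIERS:
        if magnitude >= threshold:
            return label
    return None
```

For example, a gene measured at 100 units in normal tissue and 250 units in prostate cancer tissue would show a change of 150% under this convention.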

In a preferred embodiment, gene expression monitoring is performed simultaneously 
on a number of genes. Multiple protein expression monitoring can be performed as well. 
Similarly, these assays may be performed on an individual basis as well. 

In this embodiment, the prostate cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of prostate cancer sequences in a
particular cell. The assays are further described below in the example. PCR techniques can
be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the prostate cancer protein are
detected. Although DNA or RNA encoding the prostate cancer protein may be detected, of
particular interest are methods wherein an mRNA encoding a prostate cancer protein is
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA, or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be
examined on a solid support such as nylon membranes and hybridizing the probe with the
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ (in situ
hybridization or ISH). In this method permeabilized cells or tissue samples are contacted
with a detectably labeled nucleic acid probe for sufficient time to allow the probe to hybridize
with the target mRNA. Following washing to remove the non-specifically bound probe, the
label is detected. For example a digoxygenin labeled riboprobe (RNA probe) that is
complementary to the mRNA encoding a prostate cancer protein is detected by binding the
digoxygenin with an anti-digoxygenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as 
described herein (secreted, transmembrane, or intracellular proteins) are used in diagnostic 
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells 
containing prostate cancer sequences are used in diagnostic assays. Such may evaluate 
tissues, e.g., immunohistochemistry, or evaluate body fluids, e.g., blood. The detection may 
be direct of cells, or indirect, e.g., of products from cells. This can be performed on an 
individual gene or corresponding polypeptide level. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes and/or corresponding 
polypeptides. 

As described and defined herein, prostate cancer proteins, including intracellular, 
transmembrane, or secreted proteins, find use as prognostic or diagnostic markers of prostate 
cancer or other prostate conditions. Detection of these proteins in putative prostate cancer 
tissue allows for detection, diagnosis, or prognosis of prostate proliferative disorders 
(malignant and non-malignant) including benign prostate hyperplasia (BPH) and cancer, and 
prostatitis. Diagnosis may also assist in selecting a therapeutic strategy, e.g., based on 
expression profiles and/or comparison to archival samples. In one embodiment, antibodies 
are used to detect prostate cancer proteins, directly or indirectly. A preferred method 
separates proteins from a sample by electrophoresis on a gel (typically a denaturing and 
reducing protein gel, but may be another type of gel, including isoelectric focusing gels and 
the like). Following separation of proteins, the prostate cancer protein is detected, e.g., by 
immunoblotting with antibodies raised against the prostate cancer protein. Methods of 
immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the prostate cancer protein find use in in
situ imaging techniques, e.g., in histology and/or in immunohistochemistry (e.g., Asai (ed.
1993) Methods in Cell Biology: Antibodies in Cell Biology (vol. 37) Academic Press). In this
method cells are contacted with from one to many antibodies to the prostate cancer protein(s).
Following washing to remove non-specific antibody binding, the presence of the antibody or
antibodies is detected. In one embodiment the antibody is detected by incubating with a
secondary antibody that contains a detectable label. In another method the primary antibody
to the prostate cancer protein(s) contains a detectable label, e.g., an enzyme marker that can
act on a substrate. In another preferred embodiment each one of multiple primary antibodies
contains a distinct and detectable label. This method finds particular use in simultaneous
screening for a plurality of prostate cancer proteins. As will be appreciated by one of
ordinary skill in the art, many other histological imaging techniques are also provided by the
invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence
activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing prostate cancer
from blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of prostate cancer proteins, which may be
diagnostic of prostate conditions beyond cancer, e.g., BPH. Antibodies can be used to detect
a prostate cancer protein by previously described immunoassay techniques including ELISA,
immunoblotting (western blotting), immunoprecipitation, BIACORE technology, and the
like. Conversely, the presence of antibodies may indicate an immune response against an
endogenous prostate cancer protein.

In a preferred embodiment, in situ hybridization of labeled prostate cancer nucleic
acid probes to tissue arrays is done. For example, arrays of tissue samples, including prostate
cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra)
is then performed. When comparing the fingerprints between an individual and a standard,
the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It
is further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic acids,
modified proteins, and cells containing prostate cancer sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to prostate cancer
or other prostate disorders, in terms of useful aspects of clinical condition, pathology, or other
information which may be relevant to long term prognosis. Again, this may be done on either
a protein or gene level, with the use of genes being preferred. Single or multiple genes may
be useful in various combinations. As above, prostate cancer probes may be attached to
biochips for the detection and quantification of prostate cancer sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.
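
By way of illustration only, the comparison of an individual's expression fingerprint against standard profiles, as described above for diagnosis and prognosis, might be sketched as follows. The sketch assumes that each profile is a mapping from gene name to expression level and that similarity is scored by Pearson correlation over the shared genes. The choice of correlation as the similarity measure, and every name in the code, is an assumption of this example rather than a method recited in the specification.

```python
# Illustrative sketch only. Pearson correlation is an assumed similarity
# measure; profile contents and labels are hypothetical.
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def closest_reference(sample, references):
    """Return (label, correlation) of the reference profile most similar
    to the sample, comparing only genes present in both profiles."""
    best_label, best_r = None, -2.0
    for label, reference in references.items():
        genes = sorted(set(sample) & set(reference))
        r = pearson([sample[g] for g in genes],
                    [reference[g] for g in genes])
        if r > best_r:
            best_label, best_r = label, r
    return best_label, best_r
```

In practice a profile would cover many genes, and the reference set could include profiles for varying severities of disease as well as normal tissue.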

Assays for therapeutic compounds 

In a preferred embodiment members of the proteins, nucleic acids, and antibodies as
described herein are used in drug screening assays. The prostate cancer proteins, antibodies,
nucleic acids, modified proteins, and cells containing prostate cancer sequences are used in
drug screening assays or by evaluating the effect of drug candidates on a "gene expression
profile" or expression profile of polypeptides. In a preferred embodiment, the expression
profiles are used, preferably in conjunction with high throughput screening techniques to
allow monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-88; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic acids,
modified proteins, and cells containing the native or modified prostate cancer proteins are
used in screening assays. That is, the present invention provides novel methods for screening
for compositions which modulate the prostate cancer phenotype or an identified physiological
function of a prostate cancer protein. As above, this can be done on an individual gene level
or by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified the differentially expressed genes herein, a variety of assays may be
executed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene as upregulated in prostate cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
prostate cancer protein. "Modulation" thus includes both an increase and a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing prostate cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue compared to
normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in
prostate cancer tissue compared to normal tissue often provides a target value of a 10-fold
increase in expression to be induced by the test compound.
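
By way of illustration only, the relationship between the observed change in expression and the modulation desired from a test compound can be written as a reciprocal. The sketch below assumes the observed change is given as a cancer-to-normal ratio; the function names are hypothetical.

```python
# Illustrative sketch only; function names are hypothetical.

def target_fold_change(observed_fold: float) -> float:
    """Fold change a test compound would need to induce to offset the
    observed change.

    observed_fold is the cancer/normal expression ratio: 4.0 means a
    4-fold increase in cancer tissue, 0.1 means a 10-fold decrease.
    """
    if observed_fold <= 0:
        raise ValueError("fold change must be positive")
    return 1.0 / observed_fold


def describe_target(observed_fold: float) -> str:
    """Describe the desired modulation in words."""
    target = target_fold_change(observed_fold)
    if target < 1.0:
        return f"decrease expression about {1.0 / target:g}-fold"
    if target > 1.0:
        return f"increase expression about {target:g}-fold"
    return "no change needed"
```

Thus a gene showing a 4-fold increase in prostate cancer tissue yields a target of a four-fold decrease, consistent with the example in the text.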

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the prostate cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, gene expression or protein monitoring of a number of
entities, i.e., an expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the prostate cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of prostate cancer sequences in a
particular cell. Alternatively, PCR may be used. Thus, a series of, e.g., microtiter plates may
be used with dispensed primers in desired wells. A PCR reaction can then be performed and
analyzed for each well.
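
By way of illustration only, dispensing primers into desired wells can be sketched as a mapping from well name to primer set. The sketch assumes a 96-well plate with rows A-H and columns 1-12, filled row by row; the plate format, the fill order, and all names are assumptions of this example.

```python
# Illustrative sketch only. A 96-well layout (rows A-H, columns 1-12)
# and row-by-row filling are assumptions; names are hypothetical.
import string

ROWS = string.ascii_uppercase[:8]   # "ABCDEFGH"
COLUMNS = range(1, 13)              # 1 through 12


def well_names():
    """Well names in row-major order: A1, A2, ..., A12, B1, ..., H12."""
    return [f"{row}{col}" for row in ROWS for col in COLUMNS]


def assign_primers(primer_sets):
    """Map each primer set to one well, filling the plate row by row."""
    wells = well_names()
    if len(primer_sets) > len(wells):
        raise ValueError("more primer sets than wells on one plate")
    return dict(zip(wells, primer_sets))
```

Each well's PCR result can then be looked up by the same well name used to dispense the primers.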

Expression monitoring can be performed to identify compounds that modify the
expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide
sequence set out in Tables 1A-4. Generally, in a preferred embodiment, a test modulator is
added to the cells prior to analysis. Moreover, screens are also provided to identify agents
that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer
protein, or interfere with the binding of a prostate cancer protein and an antibody or other
binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence,
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter
expression profiles, or expression profile nucleic acids or proteins provided herein. In one
embodiment, the modulator suppresses a prostate cancer phenotype, e.g., to a normal or
non-malignant tissue fingerprint. In another embodiment, a modulator induces a prostate cancer
phenotype. Generally, a plurality of assay mixtures are run in parallel with different agent
concentrations to obtain a differential response to the various concentrations. Typically, one
of these concentrations serves as a negative control, i.e., at zero concentration or below the
level of detection.
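
By way of illustration only, a set of parallel assay concentrations with a zero-concentration negative control might be laid out as follows. The use of a serial dilution to choose the concentrations, and the normalization of each signal to the control, are assumptions of this sketch; the function names are hypothetical.

```python
# Illustrative sketch only. Serial dilution and normalization to the
# control are assumptions; names are hypothetical.

def concentration_series(top: float, dilution_factor: float, steps: int):
    """Serial dilution starting at `top`, followed by a zero-concentration
    negative control."""
    if top <= 0 or dilution_factor <= 1 or steps < 1:
        raise ValueError("need top > 0, dilution_factor > 1, steps >= 1")
    series = [top / dilution_factor ** i for i in range(steps)]
    series.append(0.0)  # negative control
    return series


def relative_to_control(signals: dict) -> dict:
    """Divide each measured signal by the signal at zero concentration."""
    control = signals[0.0]
    if control == 0:
        raise ValueError("control signal must be non-zero")
    return {conc: signal / control for conc, signal in signals.items()}
```

Plotting the normalized signal against concentration then gives the differential response to the various concentrations.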

Drug candidates encompass numerous chemical classes, though typically they are
organic molecules, preferably small organic compounds having a molecular weight of more
than 100 and less than about 2,500 daltons. Preferred small molecules are less than 2000, or
less than 1500, or less than 1000, or less than 500 D. Candidate agents comprise functional
groups necessary for structural interaction with proteins, particularly hydrogen bonding, and
typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least
two of the functional chemical groups. The candidate agents often comprise cyclical carbon
or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or
more of the above functional groups. Candidate agents are also found among biomolecules
including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives,
structural analogs, or combinations thereof. Particularly preferred are peptides.
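
By way of illustration only, the molecular weight window and functional group preferences recited above can be applied as a simple filter over a list of candidates. The record layout, the compound names used with it, and the function names are hypothetical, and the sketch does not address how functional groups are determined for a given compound.

```python
# Illustrative sketch only. The Candidate record and all names are
# hypothetical; group annotation is assumed to be supplied by the user.
from dataclasses import dataclass

# Functional groups named in the text as typical for candidate agents.
NAMED_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str                  # hypothetical identifier
    molecular_weight: float    # daltons
    groups: frozenset          # functional groups present


def named_group_count(candidate: Candidate) -> int:
    """Number of the named functional groups present in the candidate."""
    return len(candidate.groups & NAMED_GROUPS)


def passes_filter(candidate: Candidate,
                  max_weight: float = 2500.0,
                  min_groups: int = 1) -> bool:
    """True if the weight is above 100 and below max_weight daltons and
    at least min_groups of the named functional groups are present."""
    in_window = 100.0 < candidate.molecular_weight < max_weight
    return in_window and named_group_count(candidate) >= min_groups
```

Setting max_weight to 2000, 1500, 1000, or 500 and min_groups to 2 applies the narrower preferences stated in the text.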

In one aspect, a modulator will neutralize the effect of a prostate cancer protein. By
"neutralize" is meant that activity of a protein, and the consequent effect on the cell, is
inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a prostate cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve providing a 
library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that 
display a desired characteristic activity. The compounds thus identified can serve as 
conventional "lead compounds" or can themselves be used as potential or actual therapeutics. 

A combinatorial chemical library is a collection of diverse chemical compounds
generated by either chemical synthesis or biological synthesis by combining a number of
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical
building blocks called amino acids in most every possible way for a given compound length
(i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks. Gallop, et al. (1994) J. Med. Chem. 37:1233-1251.
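
By way of illustration only, the size of a linear combinatorial library grows as the number of building blocks raised to the power of the compound length, which is how a modest set of amino acids yields millions of compounds. The sketch below assumes the 20 standard amino acids in one-letter code; the function names are hypothetical.

```python
# Illustrative sketch only; function names are hypothetical.
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(n_building_blocks: int, length: int) -> int:
    """Number of distinct linear compounds of the given length."""
    return n_building_blocks ** length


def enumerate_library(building_blocks: str, length: int):
    """Yield every linear combination of building blocks of the given
    length (practical only for small libraries)."""
    for combination in product(building_blocks, repeat=length):
        yield "".join(combination)
```

With 20 amino acids, a library of pentapeptides already contains 3,200,000 members.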

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Int. J. Pept. Prot. Res.
37:487-493, Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568-xxx), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding
(Hirschmann, et al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic
syntheses of small compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc.
116:2661-xxx), oligocarbamates (Cho, et al. (1993) Science 261:1303-1305), and/or peptidyl
phosphonates (Campbell, et al. (1994) J. Org. Chem. 59:658-xxx; see, generally, Gordon, et
al. (1994) J. Med. Chem. 37:1385-1401), nucleic acid libraries (see, e.g., Stratagene, Corp.),
peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g.,
Vaughn, et al. (1996) Nature Biotechnology 14:309-314, and PCT/US96/10287),
carbohydrate libraries (see, e.g., Liang, et al. (1996) Science 274:1520-1522, and U.S. Patent
No. 5,593,853), and small organic molecule libraries (see, e.g., benzodiazepines, Baum
(1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and
metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and
5,519,134; morpholino compounds, U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent
No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially available (see,
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville, KY; Symphony, Rainin,
Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase
chemistries. These systems include automated workstations like the automated synthesis
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. Many of the above devices are suitable for use with the present invention. The
nature and implementation of modifications to these devices (if any) so that they can operate
as discussed herein will be apparent to persons skilled in the relevant art. In addition,
numerous combinatorial libraries are themselves commercially available (see, e.g.,
ComGenex, Princeton, N.J.; Asinex, Moscow, RU; Tripos, Inc., St. Louis, MO; ChemStar,
Ltd, Moscow, RU; 3D Pharmaceuticals, Exton, PA; Martek Biosciences, Columbia, MD;
etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription,
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of
polypeptide activity.

High throughput assays for the presence, absence, quantification, or other properties
of particular nucleic acids or protein products are well known to those of skill in the art.
Binding assays and reporter gene assays are similarly well known. Thus, e.g., U.S.
Patent No. 5,559,410 discloses high throughput screening methods for proteins, U.S. Patent
No. 5,585,639 discloses high throughput screening methods for nucleic acid binding (i.e., in
arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high throughput methods of
screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures, including sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way
libraries of proteins may be made for screening in the methods of the invention. Particularly
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins,
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins as is outlined above, random peptides, or "biased" random peptides. By
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. Since generally
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they
may typically incorporate any nucleotide or amino acid at any position. The synthetic
process can be designed to generate randomized proteins or nucleic acids, to allow the
formation of all or most of the possible combinations over the length of the sequence, thus
forming a library of randomized candidate bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some
positions within the sequence are either held constant, or are selected from a limited number
of possibilities. For example, in a preferred embodiment, the nucleotides or amino acid
residues are randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic
residues, sterically biased (either small or large) residues, towards the creation of nucleic acid
binding domains, the creation of cysteines for cross-linking, prolines for SH-3 domains,
serines, threonines, tyrosines, or histidines for phosphorylation sites, etc., or to purines, etc.
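
By way of illustration only, the distinction between a fully randomized library and a biased library can be sketched with a per-position template, in which each position lists the residues permitted there. A position listing one residue is held constant; a position listing a class is randomized within that class. The template below, the membership of the hydrophobic class, and all names are assumptions of this example, and the sketch simulates sequence selection in software rather than chemical synthesis.

```python
# Illustrative sketch only. The template, the hydrophobic class
# membership, and all names are assumptions of this example.
import random

ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids
HYDROPHOBIC = "AVLIMFW"                 # one illustrative class

# Fully randomized 7-mer: any residue at any position.
RANDOM_TEMPLATE = [ALL_RESIDUES] * 7

# Biased 7-mer: cysteines held constant at both ends (e.g., for
# cross-linking), hydrophobic residues at positions 2 and 6.
BIASED_TEMPLATE = ["C", HYDROPHOBIC, ALL_RESIDUES, ALL_RESIDUES,
                   ALL_RESIDUES, HYDROPHOBIC, "C"]


def random_peptide(template, rng):
    """Draw one residue per position from that position's allowed set."""
    return "".join(rng.choice(allowed) for allowed in template)


def make_library(template, size, seed=0):
    """Generate `size` peptides; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return [random_peptide(template, rng) for _ in range(size)]
```

The same template approach applies to nucleic acids by substituting nucleotides for residues.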

Modulators of prostate cancer can also be nucleic acids, as defined above. 

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
For example, digests of prokaryotic or eukaryotic genomes may be used as is outlined above
for proteins.

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After the candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence to be analyzed is added to the
biochip. If required, the target sequence is prepared using known techniques. For example,
the sample may be treated to lyse the cells, using known lysis buffers, electroporation, etc.,
with purification and/or amplification such as PCR performed as appropriate. For example,
an in vitro transcription with labels covalently attached to the nucleotides is performed.
Generally, the nucleic acids are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

As will be appreciated by those in the art, these assays can be direct hybridization
assays or can comprise "sandwich assays", which include the use of multiple probes, as is
generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730, 5,594,117,
5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352, 5,594,118,
5,359,100, 5,124,246, and 5,681,697, each of which is hereby incorporated by reference. In
this embodiment, in general, the target nucleic acid is prepared as outlined above, and then
added to the biochip comprising a plurality of nucleic acid probes, under conditions that
allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate, and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components
of the reaction may be added simultaneously, or sequentially, in different orders, with
preferred embodiments outlined below. In addition, the reaction may include a variety of
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.,
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in 
expression levels as between states, of individual genes, forming a gene expression profile. 

Screens are performed to identify modulators of the prostate cancer or related
phenotype. In one embodiment, screening is performed to identify modulators that can
induce or suppress a particular expression profile, thus preferably generating the associated
phenotype. In another embodiment, e.g., for diagnostic applications, having identified
differentially expressed genes important in a particular state, screens can be performed to
identify modulators that alter expression of individual genes. In another embodiment,
screening is performed to identify modulators that alter a biological function of the
expression product of a differentially expressed gene. Again, having identified the
importance of a gene in a particular state, screens are performed to identify agents that bind
and/or modulate the biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a candidate
agent. After identifying a modulator based upon its ability to suppress a prostate cancer
expression pattern leading to a normal expression pattern, or to modulate a single prostate
cancer gene expression profile so as to mimic the expression of the gene from normal tissue,
a screen as described above can be performed to identify genes that are specifically
modulated in response to the agent. Comparing expression profiles between normal tissue
and agent treated prostate cancer tissue reveals genes that are not expressed in normal tissue
or prostate cancer tissue, but are expressed in agent treated tissue. These agent-specific
sequences can be identified and used by methods described herein for prostate cancer genes
or proteins. In particular these sequences and the proteins they encode find use in marking or
identifying agent treated cells. In addition, antibodies can be raised against the agent induced
proteins and used to target novel therapeutics to the treated prostate cancer tissue sample.

Thus, in one embodiment, a test compound is administered to a population of prostate
cancer cells that have an associated prostate cancer expression profile. By "administration"
or "contacting" herein is meant that the candidate agent is added to the cells in such a manner
as to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (e.g., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.

Once the test compound has been administered to the cells, the cells can be washed if 
desired and are allowed to incubate under preferably physiological conditions for some 
period of time. The cells are then harvested and a new gene expression profile is generated, 
as outlined herein. 

Thus, e.g., prostate cancer or non-malignant tissue may be screened for agents that
modulate, e.g., induce or suppress the prostate cancer or related phenotype. A change in at
least one gene, preferably many, of the expression profile indicates that the agent has an
effect on prostate cancer activity. By defining such a signature for the prostate cancer
phenotype, screens for new drugs that alter the phenotype can be devised. With this
approach, the drug target need not be known and need not be represented in the original
expression screening platform, nor does the level of transcript for the target protein need to
change.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of either the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins"
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a
fragment, or alternatively, be the full length protein corresponding to the fragment encoded by
the nucleic acids of Tables 1A-4. Preferably, the prostate cancer modulatory protein is a fragment.
In a preferred embodiment, the prostate cancer amino acid sequence which is used to
determine sequence identity or similarity is encoded by a nucleic acid of Tables 1A-4. In
another embodiment, the sequences are naturally occurring allelic variants of a protein
encoded by a nucleic acid of Tables 1A-4. In another embodiment, the sequences are
sequence variants as further described herein.

Preferably, the prostate cancer modulatory protein is a fragment approximately 14
to 24 amino acids long. More preferably the fragment is a soluble fragment. Preferably, the
fragment includes a non-transmembrane region. In a preferred embodiment, the fragment has
an N-terminal Cys to aid in solubility. In one embodiment, the C-terminus of the fragment is
kept as a free acid and the N-terminus is a free amine to aid in coupling, i.e., to cysteine.

In one embodiment the prostate cancer proteins are conjugated to an immunogenic
agent as discussed herein. In one embodiment the prostate cancer protein is conjugated to
BSA.

Measurements of prostate cancer polypeptide activity, or of prostate cancer or the
prostate cancer phenotype can be performed using a variety of assays. For example, the
effects of the test compounds upon the function of the prostate cancer polypeptides can be
measured by examining parameters described above. A suitable physiological change that
affects activity can be used to assess the influence of a test compound on the polypeptides of
this invention. When the functional consequences are determined using intact cells or
animals, one can also measure a variety of effects such as, in the case of prostate cancer
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone
release, transcriptional changes to both known and uncharacterized genetic markers (e.g.,
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes
in intracellular second messengers such as cGMP. In the assays of the invention, a
mammalian prostate cancer polypeptide is typically used, e.g., mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro.
For example, a prostate cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
prostate cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is measured using immunoassays such as western blotting,
ELISA, and the like with an antibody that selectively binds to the prostate cancer polypeptide
or a fragment thereof. For measurement of mRNA, amplification, e.g., using PCR, LCR, or
hybridization assays, e.g., northern hybridization, RNAse protection, dot blotting, are
preferred. The level of protein or mRNA is detected using directly or indirectly labeled
detection agents, e.g., fluorescently or radioactively labeled nucleic acids, radioactively or
enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the prostate cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity
is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "prostate cancer proteins." The prostate cancer
protein may be a fragment, or alternatively, be the full length protein corresponding to a
fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the prostate cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a prostate cancer
protein and a candidate compound, and determining the binding of the compound to the
prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein,
although other mammalian proteins may also be used, e.g., for the development of animal
models of human disease. In some embodiments, as outlined herein, variant or derivative
prostate cancer proteins may be used.

Generally, in a preferred embodiment of the methods herein, the prostate cancer
protein or the candidate agent is non-diffusably bound to an insoluble support having isolated
sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble supports may be
made of any composition to which the compositions can be bound, that is readily separated
from soluble material, and that is otherwise compatible with the overall method of screening.
The surface of such supports may be solid or porous and of a convenient shape. Examples of
suitable insoluble supports include microtiter plates, arrays, membranes, and beads. These
are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon or
nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because a
large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition should be compatible with
the reagents and overall methods of the invention, maintain the activity of the composition,
and be nondiffusable. Preferred methods of binding include the use of antibodies (which do
not sterically block either the ligand binding site or activation sequence when the protein is
bound to the support), direct binding to "sticky" or ionic supports, chemical crosslinking, the
synthesis of the protein or agent on the surface, etc. Following binding of the protein or
agent, excess unbound material is removed by washing. The sample receiving areas may
then be blocked through incubation with bovine serum albumin (BSA), casein, or other
innocuous protein or other moiety.

In a preferred embodiment, the prostate cancer protein is bound to the support, and a
test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the prostate cancer protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the prostate
cancer protein may be done in a number of ways. In a preferred embodiment, the compound
is labeled, and binding determined directly, e.g., by attaching all or a portion of the prostate
cancer protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label),
washing off excess reagent, and determining whether the label is present on the solid support.
Various blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., 125I for the proteins and a fluorophor for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also
useful.

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor is a binding moiety known to bind to the target molecule (i.e.,
a prostate cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40°C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled
component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the prostate cancer protein and thus is capable of binding to, and potentially modulating,
the activity of the prostate cancer protein. In this embodiment, either component can be
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with incubation and
washing, followed by the competitor. The absence of binding by the competitor may indicate
that the test compound is bound to the prostate cancer protein with a higher affinity. Thus, if
the test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the prostate
cancer protein.

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the prostate cancer proteins. In this
embodiment, the methods comprise combining a prostate cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a prostate cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the prostate cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the prostate cancer protein.
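
The two-sample comparison described above can be sketched in Python as follows. This is an illustrative sketch only: the replicate signal values, the function names, and the 50 percent cut-off are hypothetical and are not specified in this disclosure, which requires only that competitor binding differ between the two samples.

```python
from statistics import mean

def percent_change_in_competitor_binding(reference_signal, test_signal):
    """Percent change in bound labeled competitor, second sample versus first.

    reference_signal: replicate readings for protein + competitor (first sample).
    test_signal: replicate readings for test compound + protein + competitor.
    """
    ref = mean(reference_signal)
    if ref <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (mean(test_signal) - ref) / ref

def is_candidate_binder(reference_signal, test_signal, min_percent_change=50.0):
    """Flag a compound when competitor binding shifts by the assumed cut-off."""
    change = percent_change_in_competitor_binding(reference_signal, test_signal)
    return abs(change) >= min_percent_change

# Hypothetical triplicate readings (e.g., counts per minute of bound label).
reference = [1000.0, 1100.0, 900.0]
with_compound = [400.0, 450.0, 350.0]

change = percent_change_in_competitor_binding(reference, with_compound)
print(change, is_candidate_binder(reference, with_compound))
```

In practice the cut-off would be set from the variability of the positive and negative controls rather than fixed in advance.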

Alternatively, differential screening is used to identify drug candidates that bind to the
native prostate cancer protein, but cannot bind to modified prostate cancer proteins. The
structure of the prostate cancer protein may be modeled, and used in rational drug design to
synthesize agents that interact with that site. Drug candidates that affect the activity of a
prostate cancer protein are also identified by screening drugs for the ability to either enhance
or reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably control
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent is determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.

A variety of other reagents may be included in the screening assays. These include
reagents like salts, neutral proteins, e.g., albumin, detergents, etc., which may be used to
facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a prostate cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising prostate cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous or
subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In
another example, the determinations are made at different stages of the cell cycle
process.

In this way, compounds that modulate prostate cancer agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the prostate cancer protein. Once identified, similar structures are evaluated to identify
critical structural features of the compound.

In one embodiment, a method of inhibiting prostate cancer cell division is provided.
The method comprises administration of a prostate cancer inhibitor. In another embodiment,
a method of inhibiting prostate cancer or other prostate proliferative condition is provided.
The method comprises administration of a prostate cancer inhibitor. In a further
embodiment, methods of treating cells or individuals with prostate cancer are provided. The
method comprises administration of a prostate cancer inhibitor.

In one embodiment, a prostate cancer inhibitor is an antibody as discussed above. In
another embodiment, the prostate cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, and metastasis assays are known to those of 
skill in the art, as described below.

Soft agar growth or colony formation in suspension

Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate a normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of prostate cancer sequences, which when expressed in host cells, inhibit
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or
eliminate the host cells' ability to grow in stirred suspension culture or suspended in
semi-solid media, such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique 3d ed.
Wiley-Liss, herein incorporated by reference. See also, the methods section of Garkavtsev, et
al. (1996), supra, herein incorporated by reference.

Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they
touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (3H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (3H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a prostate cancer-associated sequence and are grown for 24 hours at saturation density in
non-limiting medium conditions. The percentage of cells labeling with (3H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.

Growth factor or serum dependence

Transformed cells have a lower serum dependence than their normal counterparts
(see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J. Exp. Med.
131:836-879); Freshney, supra. This is in part due to release of various growth factors by the
transformed cells. Growth factor or serum dependence of transformed host cells can be
compared with that of control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
pp. 178-184 in Mihich (ed. 1985) Biological Responses in Cancer Plenum). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman (1992) Angiogenesis and Cancer, Sem. Cancer Biol.

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" pp. 178-184 in Mihich (ed. 1985) Biological Responses in Cancer Plenum;
and Freshney (1985) Anticancer Res. 5:111-130.

Invasiveness into Matrigel

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate prostate
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of
invasion of host cells can be measured by using filters coated with Matrigel or some other
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the
filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with 125I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Tumor growth in vivo

Effects of prostate cancer-associated sequences on cell growth can be tested in
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted.
Knock-out transgenic mice can be made by insertion of a marker gene or other heterologous gene
into the endogenous prostate cancer gene site in the mouse genome via homologous
recombination. Such mice can also be made by substituting the endogenous prostate cancer
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous
prostate cancer gene, e.g., by exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288-1292). Chimeric targeted mice can
be derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual CSH Press; and Robertson (ed. 1987) Teratocarcinomas and Embryonic Stem Cells:
A Practical Approach IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921-930), a SCID mouse, a thymectomized mouse, or an irradiated
mouse (see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263-272; Selby, et al. (1980) Br. J.
Cancer 41:52-61) can be used as a host. Transplantable tumor cells (typically about 10^6 cells)
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while
normal cells of similar origin will not. In hosts which developed invasive tumors, cells
expressing a prostate cancer-associated sequence are injected subcutaneously. After a
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or
by its two largest dimensions) and compared to the control. Tumors that have statistically
significant reduction (using, e.g., Student's T test) are said to have inhibited growth.
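
The measurement and comparison just described can be sketched in Python as follows. Two points are assumptions made for illustration and are not stated in this disclosure: the volume is approximated from the two largest dimensions as length x width^2 / 2, a commonly used caliper approximation, and the tumor volumes shown are hypothetical. The function returns the pooled-variance Student's t statistic and its degrees of freedom; significance would then be read from a t table or a statistics package.

```python
import math
from statistics import mean, variance

def tumor_volume(length_mm, width_mm):
    """Approximate tumor volume (mm^3) from its two largest dimensions.

    Uses V = L * W^2 / 2, an assumed caliper approximation, where L is the
    longer and W the shorter of the two measurements.
    """
    longest, shortest = max(length_mm, width_mm), min(length_mm, width_mm)
    return longest * shortest ** 2 / 2.0

def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance, plus its df.

    Each group needs at least two values; identical values in both groups
    give a zero standard error and raise ZeroDivisionError.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df

# Hypothetical tumor volumes (mm^3) measured after the growth period.
control_volumes = [900.0, 1000.0, 1100.0]
treated_volumes = [400.0, 500.0, 600.0]

t_stat, dof = student_t(control_volumes, treated_volumes)
print(tumor_volume(10.0, 6.0), round(t_stat, 3), dof)
```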

Polynucleotide modulators of prostate cancer
Antisense and RNAi Polynucleotides

In certain embodiments, the activity of a prostate cancer-associated protein is
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic acid
complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof.
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability
of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or
inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur containing
species which are known for use in the art. Analogs are comprehended by this invention so
long as they function effectively to hybridize with the prostate cancer protein mRNA. See,
e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the
antisense strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for prostate cancer molecules. A preferred antisense molecule is for a
prostate cancer sequence in Tables 1A-4, or for a ligand or activator thereof. Antisense or
sense oligonucleotides, according to the present invention, comprise a fragment generally at
least about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive
an antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given
protein is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659-2668; and van der
Krol, et al. (1988) BioTechniques 6:958-976.
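
As a minimal sketch of how candidate antisense oligonucleotides in the 14 to 30 nucleotide range might be enumerated from a cDNA sequence, the following Python code takes the reverse complement of each window of the coding strand. The cDNA string is a made-up example and the function names are hypothetical; the sketch does not assess specificity, secondary structure, or target accessibility, all of which would matter in practice.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return dna.translate(COMPLEMENT)[::-1]

def antisense_candidates(cdna, length=14):
    """List (start, oligo) pairs: DNA oligos complementary to each cDNA window.

    length is restricted to the 14 to 30 nucleotide range given in the text.
    """
    if not 14 <= length <= 30:
        raise ValueError("length must be between 14 and 30 nucleotides")
    cdna = cdna.upper()
    return [
        (start, reverse_complement(cdna[start:start + length]))
        for start in range(len(cdna) - length + 1)
    ]

# Hypothetical 20-nucleotide stretch of coding-strand cDNA.
example_cdna = "ATGGCCAAGCTTGACCTGAA"
candidates = antisense_candidates(example_cdna, length=14)
print(candidates[0])
```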

RNA interference is a mechanism to suppress gene expression in a sequence specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 411:494-498.
The mechanism may be used to downregulate expression levels of identified genes, e.g.,
for treatment of, or validation of relevance to, disease.

Ribozymes

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit
transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an RNA
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axhead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No.
5,254,678. Methods of preparing are well known to those of skill in the art. See, e.g., WO
94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al.
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126.

Polynucleotide modulators of prostate cancer may be introduced into a cell containing 

20 the target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as 
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to, 
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell 
surface receptors. Preferably, conjugation of the ligand binding molecule does not 
substantially interfere with the ability of the ligand binding molecule to bind to its 

corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate 
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by 
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating prostate disorders, e.g., cancer in 
cells or organisms, are provided. In one embodiment, the methods comprise administering to 
a patient, e.g., to a cell within the patient, an anti-prostate cancer antibody that reduces or
eliminates the biological activity of an endogenous prostate cancer protein. Alternatively, the 
methods comprise administering to a cell or organism a recombinant nucleic acid encoding a 
prostate cancer protein. This may be accomplished in many ways. In a preferred 
embodiment, e.g., when the prostate cancer sequence is down-regulated in prostate cancer,
such state may be reversed by increasing the amount of prostate cancer gene product in the 
cell. This can be accomplished, e.g., by overexpressing the endogenous prostate cancer gene 
or administering a gene encoding the prostate cancer sequence, using known gene-therapy 
techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety. 
Alternatively, e.g., when the prostate cancer sequence is up-regulated in prostate cancer, the 
activity of the endogenous prostate cancer gene is decreased, e.g., by the administration of a 
prostate cancer antisense nucleic acid. 

In one embodiment, the prostate cancer proteins of the present invention may be used
to generate polyclonal and monoclonal antibodies to prostate cancer proteins. Similarly, the
prostate cancer proteins can be coupled, using standard technology, to affinity 
chromatography columns. These columns may then be used to purify prostate cancer 
antibodies useful for production, diagnostic, or therapeutic purposes. In a preferred 

embodiment, the antibodies are generated to epitopes unique to a prostate cancer protein; that
is, the antibodies show little or no cross-reactivity to other proteins. The prostate cancer 
antibodies may be coupled to standard affinity chromatography columns and used to purify 
prostate cancer proteins. The antibodies may also be used as blocking polypeptides, as 
outlined above, since they will specifically bind to the prostate cancer protein. 


Methods of identifying variant prostate cancer-associated sequences 

Without being bound by theory, expression of various prostate cancer sequences is 
correlated with prostate cancer or other prostate disorders. Accordingly, disorders based on 
mutant or variant prostate cancer genes may be determined. In one embodiment, the 
invention provides methods for identifying cells containing variant prostate cancer genes,
e.g., determining all or part of the sequence of at least one endogenous prostate cancer gene
in a cell. This may be accomplished using many sequencing techniques. In a preferred
embodiment, the invention provides methods of identifying the prostate cancer genotype of
an individual, e.g., determining all or part of the sequence of at least one prostate cancer gene
of the individual. This is generally done in at least one tissue of the individual, and may 
include the evaluation of a number of tissues or different samples of the same tissue. The 
method may include comparing the sequence of the sequenced prostate cancer gene to a
known prostate cancer gene, e.g., a wild-type gene. 

The sequence of all or part of the prostate cancer gene can then be compared to the 
sequence of a known prostate cancer gene to determine if differences exist. This can be done 
using many known homology programs, such as Bestfit, etc. In a preferred embodiment, the 

presence of a difference in the sequence between the prostate cancer gene of the patient and
the known prostate cancer gene correlates with a disease state or a propensity for a disease 
state, as outlined herein. 
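
The comparison step described above can be illustrated with a minimal sketch. It is not the Bestfit program named in the text; it uses the `difflib` module from the Python standard library as a stand-in, and the function name and the two toy sequences are hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def sequence_differences(reference: str, sample: str):
    """List the regions where a sample sequence departs from a reference.

    Each entry is (operation, reference_start, reference_segment, sample_segment),
    where operation is 'replace', 'delete', or 'insert'.
    """
    # autojunk=False keeps the matcher from discarding frequent bases as "junk".
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical toy sequences: a "known" gene fragment and a patient-derived
# fragment that differs from it at a single position.
known_gene = "ATGGCCTTAGGC"
patient_gene = "ATGGCCTTCGGC"

for op, start, ref_seg, sample_seg in sequence_differences(known_gene, patient_gene):
    print(f"{op} at reference position {start}: {ref_seg!r} -> {sample_seg!r}")
```

A dedicated alignment program would normally be preferred for real sequence data; the sketch only shows the shape of the comparison, namely reporting where a sequenced gene differs from a known one.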

In a preferred embodiment, the prostate cancer genes are used as probes to determine 
the number of copies of the prostate cancer gene in the genome. 

In another preferred embodiment, the prostate cancer genes are used as probes to
determine the chromosomal localization of the prostate cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis in particular when 
chromosomal abnormalities such as translocations, and the like are identified in the prostate 
cancer gene locus. 


Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a prostate cancer protein or 
modulator thereof, is administered to a patient. By "therapeutically effective dose" herein is 
meant a dose that produces the effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman (1993) Pharmaceutical Dosage Forms (vols. 1-3, Dekker, ISBN
0824770846, 082476918X, 0824712692, 0824716981); Lloyd (1999) The Art, Science and
Technology of Pharmaceutical Compounding Amer. Pharma. Assn.; and Pickar (1999)
Dosage Calculations Thomson). Adjustments for prostate cancer degradation, systemic
versus localized delivery, and rate of new protease synthesis, as well as the age, body weight,
general health, sex, diet, time of administration, drug interaction, and the severity of the 
condition may be necessary, and will be ascertainable with routine experimentation by those
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly
incorporated by reference.

A "patient" for the purposes of the present invention includes both humans and other
animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a 
primate, and in the most preferred embodiment the patient is human. The patient typically 
will suffer from a prostate proliferative disorder, e.g., malignant or non-malignant, and may 

include cancer or other related conditions or disorders.

The administration of the prostate cancer proteins and modulators thereof of the 
present invention can be done in a variety of ways as discussed above, including, but not 
limited to, orally, subcutaneously, intravenously, intranasally, transdermally, 
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In 

some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer
proteins and modulators may be directly applied as a solution or spray, or via catheter. 

The pharmaceutical compositions of the present invention comprise a prostate cancer 
protein in a form suitable for administration to a patient. In the preferred embodiment, the 
pharmaceutical compositions are in a water soluble form, such as being present as 

pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the 
biological effectiveness of the free bases and that are not biologically or otherwise 
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid, 
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid, 

propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid, 
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid, and the like. 
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases 
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper, 

manganese, aluminum salts, and the like. Particularly preferred are the ammonium,

potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically 
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines, 
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine, 
tripropylamine, and ethanolamine. 

The pharmaceutical compositions may also include one or more of the following: 
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose,
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents; 
coloring agents; and polyethylene glycol. 

The pharmaceutical compositions can be administered in a variety of unit dosage 
forms depending upon the method of administration. For example, unit dosage forms 

suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies, 
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally, 
should be protected from digestion. This is typically accomplished either by complexing the 
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by 

packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art. 

The compositions for administration will commonly comprise a prostate cancer 
protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous 
carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like. These 

solutions are typically sterile and generally free of undesirable matter. These compositions
may be sterilized by conventional, well known sterilization techniques. The compositions 
may contain pharmaceutically acceptable auxiliary substances as required to approximate 
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents 
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, 

sodium lactate, and the like. The concentration of active agent in these formulations can vary
widely, and will be selected primarily based on fluid volumes, viscosities, body weight, and
the like in accordance with the particular mode of administration selected and the patient's
needs (e.g., (1980) Remington's Pharmaceutical Science (15th ed.); and Hardman, et al. (eds.
2001) Goodman & Gilman: The Pharmacological Basis of Therapeutics McGraw-Hill).

Thus, a typical pharmaceutical composition for intravenous administration would be

about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per 
day may be used, particularly when the drug is administered to a secluded site and not into 
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally 
administrable compositions will be known or apparent to those skilled in the art, e.g., 
Remington's Pharmaceutical Science and Goodman and Gilman: The Pharmacological Basis 
of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be 
administered for therapeutic or prophylactic treatments. In therapeutic applications, 
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an 
amount sufficient to cure or at least partially retard or arrest the disease and its complications. 

An amount adequate to accomplish this is defined as a "therapeutically effective dose."

Amounts effective for this use will depend upon the severity of the disease and the general 
state of the patient's health. Single or multiple administrations of the compositions may be 
administered depending on the dosage and frequency as required and tolerated by the patient. 
The composition should provide a sufficient quantity of the agents of this invention to 

effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose." 
The particular dose required for a prophylactic treatment will depend upon the medical 
condition and history of the mammal, the particular cancer being prevented, as well as other 
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic 

treatments may be used, e.g., in a mammal who has previously had cancer to prevent a

recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood 
of developing cancer, e.g., based partly on gene expression profiles. 

It will be appreciated that the present prostate cancer protein-modulating compounds 
can be administered alone or in combination with additional prostate cancer modulating 

compounds or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides 
comprising nucleic acid sequences set forth in Tables 1A-4, such as antisense polynucleotides,
silencing RNA, or ribozymes, will be introduced into cells, in vitro or in vivo. The present 
invention provides methods, reagents, vectors, and cells useful for expression of prostate 

cancer-associated polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo
(cell or organism-based) recombinant expression systems. 
The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for 
introducing foreign nucleotide sequences into host cells may be used. These include the use 
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection, 
plasmid vectors, viral vectors, and many other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see,
e.g., Berger and Kimmel (1987) Guide to Molecular Cloning Techniques from Methods in
Enzymology (vol. 152) Academic Press; Ausubel, et al., (eds. supplemented through 1999)
Current Protocols Lippincott; and Sambrook, et al. (1989) Molecular Cloning: A Laboratory
Manual (2d ed., Vol. 1-3) CSH Press).

In a preferred embodiment, prostate cancer proteins and modulators are administered 
as therapeutic agents, and can be formulated as outlined above. Similarly, prostate cancer 
genes (including both the full-length sequence, partial sequences, or regulatory sequences of 
the prostate cancer coding regions) can be administered in a gene therapy application. These 

prostate cancer genes can include antisense applications, either as gene therapy (i.e., for

incorporation into the genome) or as antisense compositions, as will be appreciated by those 
in the art. 

Prostate cancer polypeptides and polynucleotides can also be administered as vaccine 
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341-
349), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al.
(1994) Vaccine 12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990)
Nature 344:873-875; Hu, et al. (1998) Clin Exp Immunol. 113:235-243), multiple antigen
peptide systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. USA 85:5409-5413;
Tam (1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379, in Kaufmann (ed. 1996) Concepts in Vaccine Development de
Gruyter; Chakrabarti, et al. (1986) Nature 320:535-537; Hu, et al. (1986) Nature 320:537-
540; Kieny, et al. (1986) AIDS Bio/Technology 4:790-xxx; Top, et al. (1971) J. Infect. Dis.
124:148-154; Chanda, et al. (1990) Virology 175:535-547), particles of viral or synthetic
origin (see, e.g., Kofler, et al. (1996) J. Immunol. Methods 192:25-35; Eldridge, et al. (1993)
Sem. Hematol. 30:16-24; Falo, et al. (1995) Nature Med. 7:649-653), adjuvants (Warren, et
al. (1986) Annu. Rev. Immunol. 4:369-388; Gupta, et al. (1993) Vaccine 11:293-306),
liposomes (Reddy, et al. (1992) J. Immunol. 148:1585-1589; Rock (1996) Immunol Today
17:131-137), or, naked or particle absorbed cDNA (Ulmer, et al. (1993) Science 259:1745-
1749; Robinson, et al. (1993) Vaccine 11:957-960; Shiver, et al., p. 423, in Kaufmann (ed.
1996) Concepts in Vaccine Development de Gruyter; Cease and Berzofsky (1994) Annu.
Rev. Immunol. 12:923-989; and Eldridge, et al. (1993) Sem. Hematol. 30:16-24). Toxin-
targeted delivery technologies, also known as receptor mediated targeting, such as those of
Avant Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance 
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or 
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available 

as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline 
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or 
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated 
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides; 

polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A, and quil A.

Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be 
used as adjuvants. 

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA 
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient. 
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465-1468 as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the invention 
can be expressed by viral or bacterial vectors. Examples of expression vectors include 
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer 
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant 
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response. 
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus 
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the 

like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol. Med. Today 6:66-71; Shedlock, et al. (2000) J. Leuk. Biol. 68:793-806; Hipp, 
et al. (2000) In Vivo 14:571-85). 

Methods for the use of genes as DNA vaccines are well known, and include placing a 
prostate cancer gene or portion of a prostate cancer gene under the control of a regulatable 

promoter or a tissue-specific promoter for expression in a prostate cancer patient. The

prostate cancer gene used for DNA vaccines can encode full-length prostate cancer proteins, 
but more preferably encodes portions of the prostate cancer proteins including peptides 
derived from the prostate cancer protein. In one embodiment, a patient is immunized with a 
DNA vaccine comprising a plurality of nucleotide sequences derived from a prostate cancer 

gene. For example, prostate cancer-associated genes or sequences encoding subfragments of a
prostate cancer protein are introduced into expression vectors and tested for their 
immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T cell 
responses. This procedure may provide for production of cytotoxic T lymphocyte responses 
against cells which present antigen, including intracellular epitopes. 

In a preferred embodiment, the DNA vaccines include a gene encoding an adjuvant

molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the prostate cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment prostate cancer genes find use in generating animal 

models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., wherein antisense RNA is directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology 
including gene knockout technology, e.g., as a result of homologous recombination with an 
appropriate gene targeting vector, will result in the absence or increased expression of the 
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary. 

It is also possible that the prostate cancer protein is overexpressed in prostate cancer. 
As such, transgenic animals can be generated that overexpress the prostate cancer protein. 
Depending on the desired expression level, promoters of various strengths can be employed 
to express the transgene. Also, the number of copies of the integrated transgene can be
determined and compared for a determination of the expression level of the transgene. 
Animals generated by such methods find use as animal models of prostate cancer and are 
additionally useful in screening for modulators to treat prostate cancer. 

Kits for Use in Diagnostic and/or Prognostic Applications

For use in diagnostic, research, and therapeutic applications suggested above, kits are 
also provided by the invention. In the diagnostic and research applications such kits may 
include one of the following: assay reagents, buffers, prostate cancer-specific nucleic acids or 
antibodies, hybridization probes and/or primers, antisense polynucleotides, silencing RNA, 

ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base. 

In addition, the kits may include instructional materials containing instructions (i.e., 
protocols) for the practice of the methods of this invention. While the instructional materials 

typically comprise written or printed materials they are not limited to such. A medium

capable of storing such instructions and communicating them to an end user is contemplated 
by this invention. Such media include, but are not limited to electronic storage media (e.g., 
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such 
media may include addresses to internet sites that provide such instructional materials. 

The present invention also provides for kits for screening for modulators of prostate

cancer-associated sequences. Such kits can be prepared from readily available materials and 
reagents. For example, such kits can comprise one or more of the following materials: a 
prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for
testing prostate cancer-associated activity. Optionally, the kit contains biologically active 
prostate cancer protein. A wide variety of kits and components can be prepared according to 
the present invention, depending upon the intended user of the kit and the particular needs of 
the user. Diagnosis would typically involve evaluation of a plurality of genes or products. 
The genes will be selected based on correlations with important parameters in disease which 
may be identified in historical or outcome data. 
EXAMPLES

Example 1: Gene Chip Analyses of Expression Profiles 

Molecular profiles of various normal and cancerous tissues were determined and 
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as 
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-
993).

Example 2: Identification of androgen dependent/independent genes

To identify gene expression changes during the transition from androgen-dependent to 

androgen-independent prostate cancer, oligonucleotide microarrays ("K" chips or Affymetrix
Eos Hu03) were interrogated with cRNAs derived from the human CWR22 prostate cancer 
xenograft model propagated in nude mice (Pretlow, et al. (1993) J. Natl. Cancer Inst. 85:394- 
398). The CWR22 xenograft is androgen-dependent when grown in male nude mice.
Androgen-independent sub-lines can be derived by first establishing androgen-dependent 

tumors in male mice. The mice are then castrated to remove the primary source of growth
stimulus (androgen), resulting in tumor regression. Within 3-10 months molecular events 
prompt the tumors to relapse and start growing as androgen-independent tumors. See, e.g., 
Nagabhushan, et al. (1996) Cancer Res. 56:3042-3046; Amler, et al. (2000) Cancer Res. 
60:6134-6141; and Bubendorf, et al. (1999) J. Natl. Cancer Inst. 91:1758-1764. 

Using the CWR22 xenograft model, tumors were grown subcutaneously in male nude

mice. Tumors were harvested at different times after castration. The time points post- 
castration included (in days): 0, 1, 3, 4, 5, 10, 30, 40, 50, 51, 52, 59, 60, 61, 70, 79, 80, 82, 
120, and 125. Analyses also included established androgen-independent xenografts. 
Castration resulted in tumor regression. At day 120 and thereafter, the tumors relapsed and 

started growing in the absence of androgen.

cRNAs were generated by in vitro transcription assays (IVTs) from the different
samples and were hybridized to the oligonucleotide microarrays (Affymetrix Eos Hu03).
Hybridization was measured by the average fluorescence intensity (AI), which is directly
proportional to the expression level of the gene.

Two types of analyses were applied to the results:

Analysis A: 






The samples were divided into different time groups which included the following 
time points post castration (in days): 1-5, 10, 30-40, 50-82, 120-125. To identify changes in 
gene expression, the following calculations were made: 

1. The median (or mean, in case there were only 2 samples in a group) was calculated
for each group.

2. The medians (or means) for each group were compared to one another.

3. Genes were selected that exhibited a minimum 2 fold difference in the median (or 
mean) between any of the groups. 

4. The change in gene expression over time was analyzed for each selected gene to look 
for specific pattern changes.
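
Steps 1 to 3 above can be sketched as follows. This is an illustrative reading of the procedure, not the code used in the study. The function names, the intensity floor used to avoid division by zero, and the example intensities are all assumptions introduced here.

```python
from itertools import combinations
from statistics import mean, median

# Time groups (days post castration) as listed in Analysis A.
TIME_GROUPS = {
    "1-5": range(1, 6),
    "10": range(10, 11),
    "30-40": range(30, 41),
    "50-82": range(50, 83),
    "120-125": range(120, 126),
}


def group_summaries(samples):
    """Step 1: summarize one gene's intensities per time group.

    `samples` is a list of (day, average_intensity) pairs for a single gene.
    The mean is used when a group has exactly 2 samples, otherwise the median.
    Groups with no samples are omitted.
    """
    summaries = {}
    for name, days in TIME_GROUPS.items():
        values = [ai for day, ai in samples if day in days]
        if values:
            summaries[name] = mean(values) if len(values) == 2 else median(values)
    return summaries


def passes_fold_filter(summaries, min_fold=2.0, floor=1.0):
    """Steps 2 and 3: True if any two groups differ by at least `min_fold`.

    `floor` is a hypothetical lower bound on intensity, added only so that
    a zero reading cannot cause a division by zero.
    """
    values = [max(v, floor) for v in summaries.values()]
    return any(max(a, b) / min(a, b) >= min_fold for a, b in combinations(values, 2))


# Hypothetical intensities for one gene across the time course.
gene_samples = [(1, 100.0), (3, 120.0), (5, 110.0), (10, 40.0),
                (30, 45.0), (40, 55.0), (120, 130.0), (125, 150.0)]
summaries = group_summaries(gene_samples)
print(summaries, passes_fold_filter(summaries))
```

Step 4, inspecting the time course of each selected gene for a specific pattern, is left to the analyst in this sketch.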

Only genes with an interesting expression pattern during the androgen-ablation time 
course were selected as potential new therapeutic targets and/or diagnostic markers. Among 
the 70,000 gene clusters present on Hu01 and Hu02, we identified 820 gene clusters with the
desired expression patterns. These expression patterns can be broadly defined into the
following categories:

1. Genes that are expressed early in the time course, then drop off in expression, and
then express again with emergence of androgen-independence (hi-lo-hi pattern in Table 1A).

2. Genes that are expressed early in the time course, then drop off in expression, and do
not express again with emergence of androgen-independence (hi-lo-lo pattern in Table 1A).

3. Genes that are not expressed early in the time course, but express only with
emergence of androgen-independence (lo-lo-hi pattern in Table 1A).

4. Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and continue to express with emergence of androgen-independence (lo-hi-hi
pattern in Table 1A).

5. Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo pattern in
Table 1A).
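
The five categories above can be expressed as a simple lookup on a three-phase profile (early, after androgen withdrawal, and at emergence of androgen-independence). The text does not state how "hi" and "lo" were called, so the single intensity threshold used below is an assumption, as are the function name and the example values.

```python
# Pattern labels and category numbers as listed for Table 1A.
PATTERN_TO_GROUP = {
    "hi-lo-hi": 1,
    "hi-lo-lo": 2,
    "lo-lo-hi": 3,
    "lo-hi-hi": 4,
    "lo-hi-lo": 5,
}


def classify_group(early, withdrawn, independent, threshold):
    """Return the category number (1-5) for a three-phase expression profile.

    Each phase is called 'hi' if its intensity is at or above `threshold`
    (a hypothetical cutoff), otherwise 'lo'. Profiles that match none of
    the five listed patterns return None.
    """
    calls = ["hi" if value >= threshold else "lo"
             for value in (early, withdrawn, independent)]
    return PATTERN_TO_GROUP.get("-".join(calls))


# Hypothetical profile: expressed early, lost after castration, regained at relapse.
print(classify_group(200.0, 50.0, 180.0, threshold=100.0))
```

Profiles that stay uniformly high or uniformly low fall outside the five categories and return `None`, consistent with the selection of only those genes whose expression changes over the time course.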

Group 1 is characterized by cell-cycle regulating genes, such as those encoding 
cyclin B1, p21/WAF1, CDC18-homolog, cyclin A2, cyclin D1, and possible growth factors
such as hAG2 (anterior gradient 2 homolog) among others. This indicates that interruption of
growth factor and/or cell cycle pathways prevents the emergence of androgen-independent 
disease, making group 1 genes good targets for treating advanced prostate cancer. 
Group 2 represents genes that are androgen-dependent, and do not re-express due to
the lack of androgen signal in the androgen-independent phenotype. This group includes 
genes encoding proteins such as Fibronectin 1, which has been previously shown to be down-
regulated with androgen-withdrawal (Amler, et al. (2000) Cancer Res. 60:6134-6141).

Group 3 represents genes that are up-regulated by signals that induce the androgen-
independent phenotype. This group includes genes encoding stanniocalcin 2, c-fos proto-
oncogene product, vascular endothelial growth factor, the cell surface protein transmembrane
4 superfamily member 1 and adrenomedullin among others. Adrenomedullin has recently 
been shown to act as an autocrine growth factor for the androgen-independent prostate cancer 

10 cell line DU145 (Rocchi, et al. (2001) Cancer Res. 61:1196-1206), indicating that its up- 
regulation is critical for supporting an androgen-independent phenotype. Blocking 
adrenomedullin function, and/or other genes in this group, prevents the growth of androgen- 
independent tumor cells. 

Group 4 represents genes that are androgen-repressed and are only expressed in the
absence of androgen. This group includes genes encoding the protein tyrosine phosphatase
interacting protein liprin-alpha 2, the CD24 antigen, and the catalytic subunit for
phosphatidylinositol 4-kinase amongst others. Patients who are treated for advanced prostate
cancer by hormone-ablation may harbor cells that have survived hormone-ablation
and are likely to up-regulate genes that belong to Group 4. Therefore, Group 4 gene
products are particularly good therapeutic targets for treating patients undergoing
hormone-ablation therapy.

Group 5 represents genes that are involved in regulating signals that induce an
androgen-independent phenotype. This group includes genes encoding Rab2 (a Ras-like G
protein), the Son of Sevenless homolog (a GTP/GDP exchange factor involved in activating
Ras-like proteins), and the p85 regulatory subunit for phosphoinositide-3-kinase (PI3-kinase).
The PI3-kinase pathway has been implicated in providing a survival signal to the prostate
cancer cell line LNCaP (Lin, et al. (1999) Cancer Res. 59:2891-2897). This indicates that
ras-like signals and signals dependent on PI3-kinase are involved in inducing the
androgen-independent phenotype. For that reason, Group 5 gene products are particularly good
therapeutic targets for treating patients undergoing hormone-ablation therapy.

Analysis B:

For the second analysis, the samples were divided into 4 time groups, which included
the following time points post-castration (in days): 0-1, 3-5, 10-82, and >120. To identify
changes in gene expression, the following analysis was performed:

1. Genes were selected that exhibited a minimum of 100 AI units at the 90th percentile
expression level of samples.

2. The group mean expression levels for each gene were calculated. The genes were further
sub-selected to exhibit a minimum 3-fold difference between the group means.

3. An analysis of variance was then performed on selected genes. From the original 59,680
gene clusters present on the Hu03 gene chip, only about 1165 genes with a P value of < 0.01
were identified that also exhibited the above-mentioned parameters.

4. A method was then employed for calculating the positive false discovery rate (pFDR), i.e.,
an estimate of the proportion of false positives present in a set of findings (Storey and
Tibshirani (2001) Technical Report, Department of Statistics, Stanford University, CA).
This technique was developed explicitly for use with microarray data. The procedure
involves randomly assigning the membership status of each sample to a group and
re-performing the analysis of variance. In each simulation, the number of group members (6 for
Group 1, 9 for Group 2, 15 for Group 3, and 4 for Group 4) remained constant, but these
designations were shuffled and assigned to each sample at random. The permutation was
performed 1000 times, and for each simulation, the number of findings at P < 0.01 was noted.
The number of false positives under null conditions was then divided by the number of
actual findings (n=1165 genes) to obtain an estimate of the proportion of false positive
findings. After the application of a correction factor, the final estimate for the pFDR was
about 1%. Thus, one can expect that approximately 12 of the 1165 findings are false
positives.

5. The approximately 1165 genes were clustered by expression pattern to identify specific
pattern changes. Only genes with an interesting expression pattern during the
androgen-ablation time course were selected as potential new therapeutic targets and/or
diagnostic markers. These expression patterns can be broadly defined into the following
categories:

1. Genes that are expressed early in the time course of androgen withdrawal, then drop off in
expression, and then express again with emergence of androgen-independence (hi-lo-lo-hi
pattern in Table 2A).

2. Genes that are expressed early in the time course, then drop off in expression immediately
after androgen-withdrawal, and do not express again with emergence of
androgen-independence (hi-lo-lo-lo pattern in Table 2A).

3. Genes that are expressed early in the time course, then drop off in expression after several
days of androgen withdrawal, and do not express again with emergence of
androgen-independence (hi-hi-lo-lo pattern in Table 2A).

4. Genes that are not expressed early in the time course, but express only with emergence of 
androgen-independence (lo-lo-lo-hi pattern in Table 2A). 

5. Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and continue to express with emergence of androgen-independence (lo-lo-hi-hi
pattern in Table 2A).

6. Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and drop off again with emergence of androgen-independence (lo-lo-hi-lo pattern 
in Table 2A). 
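
Steps 1 to 4 of Analysis B can be sketched as follows, under stated assumptions. The helper names are hypothetical. The "3-fold difference" is read here as the largest group mean being at least three times the smallest, and a critical F value passed in as `f_crit` stands in for the P < 0.01 cutoff (about 4.5 for 3 and 30 degrees of freedom, which matches four groups totalling 34 samples). The correction factor mentioned in step 4 is not described in the text and is therefore omitted, so the result is an uncorrected estimate.

```python
import random
from statistics import mean, quantiles


def split_groups(values, labels):
    """Collect one gene's expression values by time-group label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return list(groups.values())


def passes_prefilter(values, labels, min_p90=100.0, min_fold=3.0):
    """Steps 1-2: 90th-percentile floor, then fold difference of group means."""
    p90 = quantiles(values, n=10)[-1]  # last decile cut point = 90th percentile
    if p90 < min_p90:
        return False
    means = [mean(g) for g in split_groups(values, labels)]
    # Assumption: intensities are non-negative, so max >= fold * min is meaningful.
    return max(means) >= min_fold * min(means)


def f_statistic(values, labels):
    """Step 3: one-way analysis of variance F statistic for one gene."""
    groups = split_groups(values, labels)
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def count_findings(expr, labels, f_crit):
    """Number of genes whose F statistic exceeds the critical value."""
    return sum(1 for values in expr if f_statistic(values, labels) > f_crit)


def estimate_pfdr(expr, labels, f_crit, n_perm=1000, seed=0):
    """Step 4: mean findings under shuffled labels divided by actual findings.

    Group sizes stay constant because the label list itself is shuffled.
    Returns (uncorrected estimate, number of actual findings).
    """
    actual = count_findings(expr, labels, f_crit)
    if actual == 0:
        return 0.0, 0
    rng = random.Random(seed)
    shuffled = list(labels)
    null_counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_counts.append(count_findings(expr, shuffled, f_crit))
    return mean(null_counts) / actual, actual


if __name__ == "__main__":
    labels = [1] * 6 + [2] * 9 + [3] * 15 + [4] * 4  # group sizes from step 4
    gene = [100.0 * g + (i % 3) for i, g in enumerate(labels)]  # synthetic data
    print(passes_prefilter(gene, labels), estimate_pfdr([gene], labels, 4.51, 50))
```

As a check on the arithmetic reported in step 4, a pFDR of about 1% applied to 1165 findings gives 0.01 x 1165 = 11.65, i.e. approximately 12 expected false positives.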

Group 1 is characterized by cell-cycle regulating genes and cell growth promoting
genes, such as those encoding cyclin B1 and CDC45 among others, growth factors/hormones
such as hAG2 (anterior gradient 2 homolog), adrenomedullin, and stanniocalcin 2 among
others, and growth factor receptors, such as the bone morphogenic protein receptor type IB
(BMP-RIB) and the endothelial differentiation lysophosphatidic acid G-protein-coupled
receptor 7 among others. Adrenomedullin has recently been shown to act as an autocrine
growth factor for the androgen-independent prostate cancer cell line DU145 (Rocchi, et al.
(2001) Cancer Res. 61:1196-1206), indicating that its up-regulation is critical for supporting
an androgen-independent phenotype. This indicates that interruption of growth factor and/or
cell cycle pathways prevents the emergence of androgen-independent disease, making Group
1 genes good targets for treating both localized and advanced prostate cancer and related
conditions.

Group 2 represents genes that are androgen-dependent, and do not re-express due to
the lack of androgen signal in the androgen-independent phenotype. This group includes
genes encoding proteins such as the endothelial protein C receptor (EPCR) and the potassium
intermediate/small conductance calcium-activated channel (subfamily N, member 2). These
genes represent targets for treating androgen-dependent prostate cancer and related
conditions.

Group 3 also represents genes that are androgen-dependent, and do not re-express due
to the lack of androgen signal in the androgen-independent phenotype. This group includes
genes encoding proteins such as Fibronectin 1, which has been previously shown to be
down-regulated with androgen-withdrawal (Amler, et al. (2000) Cancer Res. 60:6134-6141), and
genes encoding signaling proteins such as Rho GTPase activating protein 1. These genes
represent targets for treating androgen-dependent prostate cancer and related conditions.

Group 4 represents genes that are up-regulated by signals that induce and maintain the
androgen-independent phenotype. This group includes genes encoding potential growth
promoting proteins such as chemokine-like factor (Unigene ID Hs.15159), colon
cancer-associated protein Mic1, and the mitogen-activated protein kinase-activated protein
kinase 2. Blocking the function of these proteins, and/or of other genes in this group,
prevents the growth of androgen-independent tumor cells and related conditions.

Group 5 represents genes that are androgen-repressed and are only expressed in the
absence of androgen or that are induced by the absence of androgen. This group includes
genes encoding transcriptional regulators such as the androgen receptor, the DNA activated
protein kinase (catalytic subunit), and nuclear factor related to kappa B binding protein
(NFRKB), among others. Patients who are treated for advanced prostate cancer by
hormone-ablation may harbor cells that have survived hormone-ablation and are likely to
up-regulate genes that belong to Group 5. Therefore, Group 5 gene products are particularly
good therapeutic targets for treating patients undergoing hormone-ablation therapy.

Group 6 represents genes that are involved in regulating signals that are induced
during androgen withdrawal and that induce an androgen-independent phenotype. This group
includes genes encoding signaling molecules such as phosphoinositide-3-kinase (class 2,
alpha polypeptide), signal transducer and activator of transcription 2 (STAT2), phospholipase
A2 (group IIA) and the protein tyrosine phosphatase interacting protein liprin-alpha 2, cell
surface receptors such as gamma-aminobutyric acid (GABA) A receptor epsilon subunit and G
protein-coupled receptor 48, and immune function proteins such as the major
histocompatibility complex class II DR alpha. The PI3-kinase pathway has been implicated
in providing a survival signal to the prostate cancer cell line LNCaP (Lin, et al. (1999) Cancer
Res. 59:2891-2897). This indicates that ras-like signals and signals dependent on PI3-kinase
are involved in inducing the androgen-independent phenotype. For that reason, Group 6 gene
products are particularly good therapeutic targets for treating patients undergoing
hormone-ablation therapy.

TABLE 1A provides Accession numbers for genes, including expressed sequence tags (incorporated in their entirety here and throughout the application where Accession
numbers are provided). Genes with an interesting expression pattern during the androgen-ablation time course were selected as potential new therapeutic targets and/or
diagnostic markers. 820 gene clusters were identified with desired expression patterns. These expression patterns can be broadly defined into the following categories:

1. Genes that are expressed early in the time course, then drop off in expression, and then express again with emergence of androgen-independence (hi-lo-hi pattern).

2. Genes that are expressed early in the time course, then drop off in expression, and do not express again with emergence of androgen-independence (hi-lo-lo pattern).

3. Genes that are not expressed early in the time course, but express only with emergence of androgen-independence (lo-lo-hi pattern).

4. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and continue to express with emergence of androgen-independence (lo-hi-hi
pattern).

5. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo
pattern).

Table 1B lists accession numbers for primekeys lacking a UnigeneID in Table 1A. For each probeset, the gene cluster number from which oligonucleotides were designed is listed.
Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). Genbank accession numbers for sequences comprising each cluster are listed in the 'Accession' column.

Table 1C lists genomic positioning for primekeys lacking UnigeneIDs and accession numbers in Table 1A. For each predicted exon, the genomic sequence source used for
prediction is listed. Nucleotide locations of each predicted exon are also listed.



TABLE 1A

Pkey | ExAccn | UnigeneID | Unigene Title | pattern
102772 | U83115 | Hs.161002 | absent in melanoma 1 | hi-lo-hi
128610 | N48373 | Hs.10247 | activated leucocyte cell adhesion molecu | hi-lo-hi
102276 | N48373 | Hs.10247 | activated leucocyte cell adhesion molecu | hi-lo-hi
100654 | A03758 |  |  | hi-lo-hi
100655 | A03758 |  |  | hi-lo-hi
135400 | X78592 | Hs.99915 | androgen receptor (dihydrotestosterone r | hi-lo-hi
331363 | AW582256 | *Hs.91011 | anterior gradient 2 (Xenepus laevis) hom | hi-lo-hi
115764 | AW582256 | *Hs.91011 | anterior gradient 2 (Xenepus laevis) hom | hi-lo-hi
120483 | BE251623 | Hs.1578 | baculoviral IAP repeat-containing 5 (sur | hi-lo-hi
101505 | AA307680 | Hs.75692 | asparagine synthetase | hi-lo-hi
127236 | AW661857 | Hs.98658 | budding uninhibited by benzimidazoles 1 | hi-lo-hi
128472 | BE241860 | *Hs.10029 | cathepsin C | hi-lo-hi
102712 | U77949 | Hs.69563 | CDC6 (cell division cycle 6, S. cerevisi | hi-lo-hi
314943 | Y00272 | Hs.184572 | cell division cycle 2, G1 to S and G2 to | hi-lo-hi
102123 | NM_001809 | *Hs.1594 | centromere protein A (17kD) | hi-lo-hi
326213 |  |  | CH.17_hs gi|5867224 | hi-lo-hi
327110 |  |  | CH.21_hs gi|6117842 | hi-lo-hi
339186 |  |  | CH22.DA59H18.GENSCANJ2-13 | hi-lo-hi
337755 |  |  | CH22_EM:AC000097.GENSCAN.109-2 | hi-lo-hi
337674 |  |  | CH22_EM:AC000097.GENSCAN.67-4 | hi-lo-hi
337675 |  |  | CH22_EM:AC000097.GENSCAN.67-6 | hi-lo-hi
333516 |  |  | CH22_FGENES.173_1 | hi-lo-hi
333517 |  |  | CH22_FGENES.173_2 | hi-lo-hi
333795 |  |  | CH22_FGENES.275J | hi-lo-hi
333796 |  |  | CH22_FGENES.275_3 | hi-lo-hi
333808 |  |  | CH22_FGENES.279_2 | hi-lo-hi
333809 |  |  | CH22_FGENES.280_2 | hi-lo-hi
332792 |  |  | CH22_FGENES.3_2 | hi-lo-hi
334101 |  |  | CH22_FGENES.327_59 | hi-lo-hi
334502 |  |  | CH22_FGENES.397_18 | hi-lo-hi
334616 |  |  | CH22_FGENES.411_15 | hi-lo-hi
334899 |  |  | CH22_FGENES.452_13 | hi-lo-hi
334900 |  |  | CH22_FGENES.452_14 | hi-lo-hi
334902 |  |  | CH22_FGENES.452_16 | hi-lo-hi
334905 |  |  | CH22_FGENES.452_20 | hi-lo-hi
334906 |  |  | CH22_FGENES.452_21 | hi-lo-hi
334951 |  |  | CH22_FGENES.465_20 | hi-lo-hi
335044 |  |  | CH22_FGENES.480_1 | hi-lo-hi
335753 |  |  | CH22_FGENES.604_2 | hi-lo-hi
335755 |  |  | CH22_FGENES.604_4 | hi-lo-hi
333135 |  |  | CH22_FGENES.83_11 | hi-lo-hi
333137 |  |  | CH22_FGENES.83_13 | hi-lo-hi
333138 |  |  | CH22_FGENES.83_15 | hi-lo-hi
333139 |  |  | CH22_FGENES.83_16 | hi-lo-hi
336721 |  |  | CH22_FGENES.83-17 | hi-lo-hi
105012 | AF098158 | Hs.9329 | chromosome 20 open reading frame 1 | hi-lo-hi
134470 | X54942 | Hs.83758 | CDC28 protein kinase 2 | hi-lo-hi
134750 | L29073 | Hs.1139 | cold shock domain protein A | hi-lo-hi
125819 | AA044840 | *Hs.251871 | CTP synthase | hi-lo-hi
102993 | BE262998 | Hs.85137 | cyclin A2 | hi-lo-hi
131185 | BE280074 | Hs.23960 | cyclin B1 | hi-lo-hi
106350 | AK001404 | *Hs.194698 | cyclin B2 | hi-lo-hi
103080 | AU077231 | *Hs.82932 | cyclin D1 (PRAD1: parathyroid adenomatos | hi-lo-hi
101216 | AA284166 | Hs.84113 | cyclin-dependent kinase inhibitor 3 (CDK | hi-lo-hi
100589 | AW247430 | Hs.84152 | cystathionine-beta-synthase | hi-lo-hi
130655 | AI831962 | Hs.17409 | cysteine-rich protein 1 (intestinal) | hi-lo-hi
101473 | M22976 | Hs.83834 | cytochrome b-5 | hi-lo-hi
101468 | BE538296 | *Hs.181028 | cytochrome c oxidase subunit Va | hi-lo-hi
103546 | Z14244 | *Hs.75752 | cytochrome c oxidase subunit VIIb | hi-lo-hi
100829 | AA471098 | Hs.278544 | acetyl-Coenzyme A acetyltransferase 2 (a | hi-lo-hi
102469 | AF058293 | Hs.180015 | D-dopachrome tautomerase | hi-lo-hi
114292 | AI815395 | Hs.184641 | fatty acid desaturase 2 | hi-lo-hi
100656 | BE250162 | *Hs.83765 | dihydrofolate reductase | hi-lo-hi
133799 | W24087 | Hs.76285 | DKFZP564B167 protein | hi-lo-hi
129113 | BE543205 | *Hs.288771 | DKFZP586A0522 protein | hi-lo-hi
332732 | AF191019 | Hs.8361 | hypothetical protein, estradiol-induced | hi-lo-hi
108846 | AL117452 | *Hs.44155 | DKFZP586G1517 protein | hi-lo-hi
133903 | X63692 | *Hs.77462 | DNA (cytosine-5-)-methyltransferase 1 | hi-lo-hi
320099 | AW411307 | Hs.114311 | CDC45 (cell division cycle 45, S.cerevis | hi-lo-hi
321960 | AA723883 | Hs.302446 | hypothetical protein MGC10334 | hi-lo-hi
324988 | AK001379 | *Hs.121028 | hypothetical protein FLJ10549 | hi-lo-hi
303274 | AK001468 | Hs.62180 | anillin (Drosophila Scraps homolog), act | hi-lo-hi
301804 | AK001468 | Hs.62180 | anillin (Drosophila Scraps homolog), act | hi-lo-hi
300551 | AW408800 | Hs.104859 | hypothetical protein DKFZp762E1312 | hi-lo-hi
304541 | AA482561 | Hs.169476 | glyceraldehyde-3-phosphate dehydrogenase | hi-lo-hi
304521 | AA464716 |  | gb:zx82c11.s1 Soares ovary tumor NbHOT H | hi-lo-hi
129075 | BE250162 | *Hs.83765 | dihydrofolate reductase | hi-lo-hi
111003 | N52980 | Hs.83765 | dihydrofolate reductase | hi-lo-hi
115536 | AK001468 | Hs.62180 | anillin (Drosophila Scraps homolog), act | hi-lo-hi
108857 | AK001468 | Hs.62180 | anillin (Drosophila Scraps homolog), act | hi-lo-hi
332397 | AB027249 | Hs.104741 | PDZ-binding kinase; T-cell originated pr | hi-lo-hi
330714 | AA263143 | Hs.24596 | RAD51-interacting protein | hi-lo-hi
104636 | R82252 | Hs.106106 | Homo sapiens cAMP-dependent protein kina | hi-lo-hi
104986 | AW088826 | Hs.22971 | ESTs | hi-lo-hi
105076 | AJ598252 | Hs.37810 | ESTs | hi-lo-hi
105312 | BE613348 | *Hs.23348 | S-phase kinase-associated protein 2 (p45 | hi-lo-hi
105388 | AW575008 | Hs.11355 | thymopoietin | hi-lo-hi
105953 | BE410556 | Hs.236556 | hypothetical protein STRAIT11499 | hi-lo-hi
106286 | AI765107 | *Hs.274422 | hypothetical protein FLJ20550 | hi-lo-hi
106889 | U46258 | Hs.18349 | HSPC145 protein | hi-lo-hi
109220 | AW958181 | Hs.169998 | ESTs | hi-lo-hi
113158 | AA328102 | Hs.24641 | cytoskeleton associated protein 2 | hi-lo-hi
114542 | AW970128 | *Hs.293380 | ESTs | hi-lo-hi
114986 | AK000361 | Hs.133260 | hypothetical protein FLJ20354 | hi-lo-hi
115291 | BE545072 | *Hs.122579 | hypothetical protein FLJ10461 | hi-lo-hi
115414 | AA662240 | Hs.283099 | AF15q14 protein | hi-lo-hi
115471 | AK001376 | Hs.59346 | hypothetical protein FLJ10514 | hi-lo-hi
115522 | BE614387 | Hs.47378 | ESTs, Moderately similar to T50635 hypot | hi-lo-hi
115652 | BE093589 | Hs.38178 | hypothetical protein FLJ23468 | hi-lo-hi
116121 | AK001330 | Hs.48855 | hypothetical protein FLJ10468 | hi-lo-hi
116130 | AW183533 | Hs.38178 | hypothetical protein FLJ23468 | hi-lo-hi
116448 | BE268321 | Hs.208912 | hypothetical protein MGC861 | hi-lo-hi
116787 | AW362955 | Hs.15641 | ESTs | hi-lo-hi
118336 | BE327311 | Hs.47166 | HT021 | hi-lo-hi
120649 | AA687322 | Hs.192843 | leucine zipper protein FKSG14 | hi-lo-hi
121503 | AA412049 | Hs.290347 | ESTs | hi-lo-hi
121748 | BE536911 | Hs.234545 | Homo sapiens NUF2R mRNA, complete cds | hi-lo-hi
122860 | AA464414 |  | gb:zx78g01.s1 Soares ovary tumor NbHOT H | hi-lo-hi
123477 | AF217515 | Hs.283532 | uncharacterized bone marrow protein BM03 | hi-lo-hi
130338 | AJ375726 | *Hs.279918 | hypothetical protein | hi-lo-hi
130680 | BE567313 | Hs.183109 | monoamine oxidase A | hi-lo-hi
131148 | AW953575 | *Hs.303125 | p53-induced protein PIGPC1 | hi-lo-hi
131626 | BE514605 | *Hs.289092 | Homo sapiens cDNA: FLJ22380 fis, clone H | hi-lo-hi
131937 | AI907735 | Hs.21446 | Homo sapiens mRNA for KIAA1716 protein, | hi-lo-hi
131965 | W79283 | Hs.35962 | ESTs | hi-lo-hi
132371 | AA235448 | Hs.46677 | PRO2000 protein | hi-lo-hi
133626 | AW836130 | Hs.75277 | hypothetical protein FLJ13910 | hi-lo-hi
300942 | AW301344 | Hs.122908 | Homo sapiens, clone IMAGE:3048353, mRNA, | hi-lo-hi
300953 | AA542845 | Hs.294088 | ESTs | hi-lo-hi
302656 | BE090580 | Hs.70704 | Homo sapiens, clone IMAGE:2823731, mRNA, | hi-lo-hi
311928 | T62216 | Hs.270840 | ESTs | hi-lo-hi
313637 | AK000742 | Hs.126774 | L2DTL protein | hi-lo-hi
313832 | AW271106 | Hs.133294 | ESTs | hi-lo-hi
316465 | AW574774 | Hs.121692 | ESTs | hi-lo-hi
317202 | AA894880 | Hs.181181 | ESTs | hi-lo-hi
320771 | R74441 | Hs.117176 | poly(A)-binding protein, nuclear 1 | hi-lo-hi
321635 | AJ820961 | Hs.193465 | ESTs | hi-lo-hi
330867 | AW978991 | Hs.221197 | ESTs | hi-lo-hi
331442 | H77381 | Hs.159420 | ESTs | hi-lo-hi
106654 | AW075485 | Hs.286049 | phosphoserine aminotransferase | hi-lo-hi
106590 | AJ350260 | Hs.301539 | hypothetical protein MGC2633 | hi-lo-hi
128460 | T16206 | Hs.237164 | ESTs, Highly similar to LDHH_HUMAN L4A | hi-lo-hi
114394 | T34462 | Hs.103291 | neuritin | hi-lo-hi
315936 | AW069807 | Hs.271252 | ESTs | hi-lo-hi
108886 | AW248434 | Hs.91521 | hypothetical protein | hi-lo-hi
129241 | AI878857 | Hs.109706 | hematological and neurological expressed | hi-lo-hi
104978 | AI199268 | Hs.19322 | ESTs, Weakly similar to CGHU7L collagen | hi-lo-hi
129626 | F13272 | Hs.111334 | ferritin, light polypeptide | hi-lo-hi
118895 | BE304917 | Hs.31097 | hypothetical protein FLJ21478 | hi-lo-hi
332577 | AI826268 | Hs.27769 | ESTs, Weakly similar to MCAT_HUMAN MITOC | hi-lo-hi
116732 | AW152225 | Hs.165909 | ESTs | hi-lo-hi
106774 | AI216748 | Hs.14587 | ESTs, Weakly similar to AF151859_1 CGM | hi-lo-hi
108818 | BE61^76 | Hs^03116 | stromal cell-derived factor 2-like 1 | hi-lo-hi
315618 | AI287341 | *Hs.154029 | bHLH factor Hes4 | hi-lo-hi
110561 | AA379597 | Hs.5199 | HSPC150 protein similar to ubiquitin-con | hi-lo-hi
132959 | AW014195 | Hs.61472 | ESTs, Weakly similar to unknown [S.cerev | hi-lo-hi
103195 | AA351647 | Hs.2642 | eukaryotic translation elongation factor | hi-lo-hi
100368 | D79987 | Hs.153479 | extra spindle poles, S. cerevisiae, homo | hi-lo-hi
103177 | BE244377 | *Hs.48876 | farnesyl-diphosphate farnesyltransferase | hi-lo-hi
109141 | AF174600 | Hs.193380 | F-box protein Fbx20 | hi-lo-hi
100676 | X02761 | *Hs.287820 | fibronectin 1 | hi-lo-hi
100254 | AA452181 | Hs.77643 | FK506-binding protein 1B (12.6 kD) | hi-lo-hi
133688 | U71321 | Hs.7557 | FK506-binding protein 5 | hi-lo-hi
107129 | AC004770 | *Hs.4756 | flap structure-specific endonuclease 1 | hi-lo-hi
102696 | BE540274 | Hs.239 | forkhead box M1 | hi-lo-hi
101753 | L11144 | Hs.1907 | galanin | hi-lo-hi
101597 | AA317089 | *Hs.597 | glutamic-oxaloacetic transaminase 1, sol | hi-lo-hi
133512 | L18861 |  | gb:Human Gollknbp gene, exon 1. | hi-lo-hi
130080 | X14850 | Hs.147097 | H2A histone family, member X | hi-lo-hi
101600 | BE561617 | *Hs.119192 | H2A histone family, member Z | hi-lo-hi
101332 | J04088 | *Hs.156346 | topoisomerase (DNA) II alpha (170kD) | hi-lo-hi
132967 | AA316181 | Hs.61635 | six transmembrane epithelial antigen of | hi-lo-hi
129726 | H15474 | Hs.132898 | fatty acid desaturase 1 | hi-lo-hi
106925 | AK002011 | Hs.37558 | hypothetical protein FLJ11149 | hi-lo-hi
105643 | BE621719 | Hs.173802 | KIAA0603 gene product | hi-lo-hi
116028 | H59799 | Hs.42644 | thioredoxin-like | hi-lo-hi
105437 | AF151076 | Hs.25199 | hypothetical protein | hi-lo-hi
122512 | AF053305 | Hs.98658 | budding uninhibited by benzimidazoles 1 | hi-lo-hi
131991 | AF053306 | Hs.36708 | budding uninhibited by benzimidazoles 1 | hi-lo-hi
135015 | AW361638 | Hs.278338 | LGN protein | hi-lo-hi
102208 | U22961 |  | gb:Human mRNA clone with similarity to L | hi-lo-hi
100144 | AL119964 | Hs.75616 | seladin-1 | hi-lo-hi
100447 | NM_014767 | Hs.74583 | KIAA0275 gene product | hi-lo-hi
116578 | D21262 | Hs.75337 | nucleolar phosphoprotein p130 | hi-lo-hi
130350 | AA369601 | Hs.239138 | pre-B-cell colony-enhancing factor | hi-lo-hi
101045 | J05614 |  | gb:Human proliferating cell nuclear anti | hi-lo-hi
101544 | M31169 |  | gb:Human propionyl-CoA carboxylase beta- | hi-lo-hi
113674 | NM_014214 | Hs.5753 | inositol(myo)-1(or 4)-monophosphatase 2 | hi-lo-hi
102260 | AL039104 | Hs.159557 | karyopherin alpha 2 (RAG cohort 1, impor | hi-lo-hi
100154 | H60720 | Hs.81892 | KIAA0101 gene product | hi-lo-hi
100199 | BE562298 | Hs.71827 | KIAA0112 protein; homolog of yeast ribos | hi-lo-hi
100372 | NM_014791 | Hs.184339 | KIAA0175 gene product | hi-lo-hi
100387 | D83777 | *Hs.75137 | KIAA0193 gene product | hi-lo-hi
131514 | BE270734 | *Hs.2795 | lactate dehydrogenase A | hi-lo-hi
102938 | W27518 | Hs.234489 | lactate dehydrogenase B | hi-lo-hi
105811 | BE617695 | Hs.286192 | protein phosphatase 1, regulatory (inhib | hi-lo-hi
101013 | BE300094 | *Hs.227751 | lectin, galactoside-binding, soluble, 1 | hi-lo-hi
124148 | BE300094 | *Hs.227751 | lectin, galactoside-binding, soluble, 1 | hi-lo-hi
102968 | AU076611 | Hs.154672 | methylene tetrahydrofolate dehydrogenase | hi-lo-hi
130149 | AW067805 | Hs.172665 | methylenetetrahydrofolate dehydrogenase | hi-lo-hi
114767 | AI859865 | Hs.154443 | minichromosome maintenance deficient (S. | hi-lo-hi
129168 | AI132988 | Hs.109052 | chromosome 14 open reading frame 2 | hi-lo-hi
105011 | BE091926 | Hs.16244 | mitotic spindle coiled-coil related prot | hi-lo-hi
103023 | AW500470 | Hs.117950 | multifunctional polypeptide similar to S | hi-lo-hi
102808 | BE242818 | *Hs.179606 | nuclear RNA helicase, DECD variant of DE | hi-lo-hi
318617 | AW247252 | Hs.75514 | nucleoside phosphorylase | hi-lo-hi
101568 | M81740 | Hs.75212 | ornithine decarboxylase 1 | hi-lo-hi
102076 | BE299197 | Hs.179665 | cyclin-dependent kinase inhibitor 1A (p2 | hi-lo-hi
100202 | BE294407 | *Hs.99910 | phosphofructokinase, platelet | hi-lo-hi
101032 | BE206854 | Hs.46039 | phosphoglycerate mutase 2 (muscle) | hi-lo-hi
130553 | AF062649 | *Hs.252587 | pituitary tumor-transforming 1 | hi-lo-hi
101626 | M57399 | Hs.44 | pleiotrophin (heparin binding growth fac | hi-lo-hi
101992 | X90725 | Hs.77597 | polo (Drosophila)-like kinase | hi-lo-hi
132164 | AI752235 | Hs.41270 | procollagen-lysine, 2-oxoglutarate 5-dio | hi-lo-hi
101396 | BE267931 | *Hs.78996 | proliferating cell nuclear antigen | hi-lo-hi
119018 | AA631143 | Hs.179809 | ESTs | hi-lo-hi
101840 | AA236291 | Hs.183583 | serine (or cysteine) proteinase inhibito | hi-lo-hi
332640 | BE568452 | Hs.5101 | protein regulator of cytokinesis 1 | hi-lo-hi
132543 | BE568452 | Hs.5101 | protein regulator of cytokinesis 1 | hi-lo-hi
101118 | AA371931 | *Hs.77422 | proteolipid protein 2 (colonic epitheliu | hi-lo-hi
109166 | AA219691 | Hs.73625 | RAB6 interacting, kinesin-like (rabkines | hi-lo-hi
100830 | AC004770 | *Hs.4756 | flap structure-specific endonuclease 1 | hi-lo-hi
107059 | BE614410 | Hs.23044 | RAD51 (S. cerevisiae) homolog (E coli Re | hi-lo-hi
321693 | AA227069 | Hs.173737 | ras-related C3 botulinum toxin substrate | hi-lo-hi
101148 | NM_002923 | Hs.78944 | regulator of G-protein signalling 2, 24k | hi-lo-hi
130567 | AA383092 | Hs.1608 | replication protein A3 (14kD) | hi-lo-hi
103076 | NM_001034 | Hs.75319 | ribonucleotide reductase M2 polypeptide | hi-lo-hi
103131 | BE536069 | Hs.2962 | S100 calcium-binding protein P | hi-lo-hi
102212 | AW411491 | Hs.75069 | serine hydroxymethyltransferase 2 (mitoc | hi-lo-hi
104254 | AW411425 | Hs.180655 | serine/threonine kinase 12 | hi-lo-hi
102748 | BE018138 | Hs.24447 | sigma receptor (SR31747 binding protein | hi-lo-hi
102012 | BE259035 | Hs.118400 | singed (Drosophila)-like (sea urchin fas | hi-lo-hi
102522 | BE250944 | Hs.183556 | solute carrier family 1 (neutral amino a | hi-lo-hi
132994 | AA112748 | Hs.279905 | clone HQ0310PRO0310p1 | hi-lo-hi
101971 | Z49105 | *Hs.289105 | synovial sarcoma, X breakpoint 2 | hi-lo-hi
126645 | AA316181 | Hs.61635 | six transmembrane epithelial antigen of | hi-lo-hi
103058 | X57348 | Hs.184510 | stratifin | hi-lo-hi
102632 | U66618 | Hs.250581 | SWI/SNF related, matrix associated, acti | hi-lo-hi
103269 | AF230562 | *Hs.289105 | synovial sarcoma, X breakpoint 2 | hi-lo-hi
126920 | AA622037 | Hs.166468 | programmed cell death 5 | hi-lo-hi
100114 | X02308 | Hs.82962 | thymidylate synthetase | hi-lo-hi
102846 | BE264974 | Hs.6566 | thyroid hormone receptor interactor 13 | hi-lo-hi
131877 | J04088 | *Hs.156346 | topoisomerase (DNA) II alpha (170kD) | hi-lo-hi
100866 | U14134 | Hs.75113 | general transcription factor IIIA | hi-lo-hi
133893 | AI434699 | Hs.77356 | transferrin receptor (p90, CD71) | hi-lo-hi
130135 | AA311426 | *Hs.21635 | tubulin, gamma 1 | hi-lo-hi
130287 | AA479005 | Hs.154036 | tumor suppressing subtransferable candid | hi-lo-hi
126180 | L32977 | Hs.3712 | ubiquinol-cytochrome c reductase, Rieske | hi-lo-hi
101536 | NM_006002 | Hs.77917 | ubiquitin carboxyl-terminal esterase L3 | hi-lo-hi
102687 | NM_007019 | *Hs.93002 | ubiquitin carrier protein E2-C | hi-lo-hi
103556 | Z19002 | Hs.37096 | zinc finger protein 145 (Kruppel-like, e | hi-lo-hi
300022 |  |  |  | hi-lo-hi-lo
133015 | AJ002744 | Hs.246315 | UDP-N-acetyl-alpha-D-galactosamine:polyp | hi-lo-hi-lo
129642 | NM_001360 | Hs.11806 | 7-dehydrocholesterol reductase | hi-lo-lo
134369 | AF207664 | Hs.8230 | a disintegrin-like and metalloprotease ( | hi-lo-lo
300023 |  |  |  | hi-lo-lo
125183 | AV660804 | Hs.301417 | AHNAK nucleoprotein (desmoyokin) | hi-lo-lo
101766 | M80899 | *Hs.301417 | AHNAK nucleoprotein (desmoyokin) | hi-lo-lo
133516 | BE265133 | *Hs.217493 | annexin A2 | hi-lo-lo
102146 | AW162057 | Hs.78629 | ATPase, Na+/K+ transporting, beta 1 poly | hi-lo-lo
318538 | AI750979 | HsJ4034 | Homo sapiens clone 24651 mRNA sequence | hi-lo-lo
103554 | AI878826 | Hs.323469 | caveolin 1, caveolae protein, 22kD | hi-lo-lo
329365 






CH.X_hs giI5868838 


hMo-lo 


334282 






CH22 FGENES.369 12 


hMo-lo 


334891 






CH22_FGENES.45?l5 


hMo-lo 


335149 


- 




CH22_FG£NES.499 5 


hMo-lo 


335682 






CH22~FGENES.595 2 


hMo-lo 


335756 






CH22_FGENES.604_5 


hMo-lo 


303951 


AW475081 


Hs.1 72928 


collagen, type I, alpha 1 


hMo-to 


134421 


AU077196 


Hs.82985 


collagen, type V, alpha 2 


hMo-lo 


131101 


BE387561 


Hs.22981 


DKFZP586M1523 protein 


hMo-lo 


124153 


AU077333 


*Hs. 160483 


erythrocyte membrane protein band 7.2 (s 


hMo-lo 


103328 


AU077333 


"Hs.160483 


erythrocyte membrane protein band 7.2 (s 


hMo-to 


322035 


AL1 37517 


•Hs.306201 


hypothetical protein DKFZp56401278 


hMo-to 


301872 


H84730 


Hs.326391 


ESTs, Highly similar to KIAA1437 protein 


hMo-to 


303820 


AB037858 


Hs.1 73484 


hypothetical protein FU 10337 


hMo-to 


304049 


T58155 




gb:yb98h03.s1 Stratagene lung (937210) H 


hMo-to 


304735 


AA576453 




gb:nm75h11.s1 NCI_CGAP_Co9 Homo sapiens 


hMo-to 


306999 


Al 138628 


Hs.308058 


EST, Weakly simitar to zinc finger prot 


hMo-to 


128789 


AW368576 


Hs.139851 


caveolin 2 


hi-lo-lo 


132057 


AB037858 


Hs.173484 


hypothetical protein FLJ10337 


hi-lo-lo 


114795 


AB037858 


Hs.173484 


hypothetical protein FLJ10337 


hi-lo-lo 


104204 


AK001691 


Hs.57655 


hypothetical protein FLJ10829 


hi-lo-lo 


105200 


AA328102 


Hs.24641 


cytoskeleton associated protein 2 


hi-lo-lo 


105493 


AL047586 


Hs.10283 


RNA binding motif protein 8B 


hi-lo-lo 


107977 


AJ188161 


Hs.1 44627 


ESTs 


hMo-lo 


108880 


AA766605 


"Hs.47099 


hypothetical protein FLJ21212 


hi-lo-lo 


111157 


AL109729 


Hs.18948 


ESTs, Highly similar to A31026 probable 


hi-lo-lo 


116202 


BE159395 


Hs.87089 


ESTs 


hMo-lo 


120669 


AW134519 



Hs.96125 


ESTs 


hMo-lo 


121847 




Hs.2799 


cartilage linking protein 1 


hi-lo-lo 


124182 


AIR37471 


Hs.107801 


ESTs 


hi-lo-lo 


128515 


D CO J .J UU .J 


Hs.10086 


type I transmembrane protein Fn14 


hi-lo-lo 


130466 


W19744 


Hs. 180059 


Homo sapiens cDNA FLJ20653 fis, clone KA 


hi-lo-lo 


131076 


AA749230 


Hs.22666 


ESTs 


hMo-to 


131084 


NM_017413 


Hs.303084 


apelin; peptide ligand for APJ receptor 


hi-lo-lo 


134109 


AA348031 


Hs.7913 


ESTs 


hMo-to 


300258 


AJ478933 


Hs.188260 


ESTs 


hMo-lo 


302767 


H94900 


Hs.17882 


ESTs 


hMo-lo 


312391 


R43707 


Hs.133159 


ESTs, Weakly similar to PI HUSO salivary 


hMo-Jo 


312689 


AW450461 


Hs.203965 


ESTs 


hMo-lo 


315715 


AJ284219 


Hs.130749 


ESTs 


hMo-to 


315843 


AA679430 


Hs.191897 


ESTs 


hMo-lo 


322447 


AI735759 


Hs.52620 


integrin, beta 8 


hMo-to 


322826 


AI807883 


Hs.201771 


ESTs 


hMo-lo 


324867 


AJ624707 


"Hs.5921 


Homo sapiens cDNA: FLJ21592 fis, clone C 


hi-lo-lo 


331336 


AA287450 


Hs.93842 


Homo sapiens cDNA: FLJ22554 fis, clone 


hi-lo-lo 


331353 


AA953006 


Hs.88143 


ESTs 


hMo-to 


133063 


AI654133 


Hs.30212 


thyroid receptor interacting protein 15 


hMo-to 


311034 


BE567130 


Hs.311389 


ESTs, Moderately similar to PT0375 natur 


hMo-to 


108647 


BE546947 


Hs.44276 


homeoboxdO 


hMo-to 


124955 


AA376768 


■Hs.324841 


hypothetical protein FLJ22622 


hi-lo-lo 


113923 


AW953484 


Hs.3849 


hypothetical protein FLJ22041 similar to 


hi-lo-lo 


310557 


AI431798 


Hs.164192 


ESTs, Weakly similar to Y161_HUMAN HYPOT 


hi-lo-lo 


302943 


AI581344 


Hs.127812 


ESTs, Weakly similar to T17330 hypotheti 


hi-lo-lo 


128453 


X02761 


•Hs.287820 


fibronectin 1 


hi-lo-lo 


305232 


AA670052 


Hs.169476 


glyceraldehyde-3-phosphate dehydrogenase 


hi-lo-lo 
117642 


U55184 


'Hs.154145 


hypothetical protein FLJ11585 


hi-lo-lo 


115881 


NM_005756 


Hs.184942 


G protein-coupled receptor 64 


hMo-to 


133666 


U56725 


Hs.75452 


heat shock 70kD protein 2 


hi-lo-lo 


103262 


X78565 


Hs.289114 


hexabrachion (tenascin C, cytotactin) 


hMo-lo 


100793 


S69027 




gb:HOX C6=class I homeodomain {fragment 


hMo-lo 


102289 


U32114 




hMo-lo 


319109 


Z45662 


Hs.90797 


Homo sapiens clone 23620 mRNA sequence 


hMo-lo 


116357 


AF052107 


Hs.90797 


Homo sapiens clone 23620 mRNA sequence 


hMo-lo 


101497 


W05150 


"Hs.37034 


homeo box A5 


hi-lo-lo 


105508 


AA173942 


Hs.326416 


Homo sapiens mRNA; cDNA DKFZp564H1916 (f 


hMo-to 


302290 


AA179949 


Hs.175563 


Homo sapiens mRNA; cDNA DKFZp564N0763 (f 


hMo-lo 


102838 


R34657 


Hs.80658 


uncoupling protein 2 (mitochondrial, pro 


hMo-lo 


100235 


D29954 


Hs.13421 


KIAA0056 protein 


hMo-lo 


133507 


NM_002206 


Hs.74369 


integrin, alpha 7 


hMo-to 


125573 


AI351642 


Hs.182241 


interferon induced transmembrane protein 


hi-lo-to 


103059 


X57351 


Hs.174195 


interferon induced transmembrane protein 


hi-lo-lo 


330415 


D83777 


•Hs.75137 


KIAA0193 gene product 


hMo-lo 


303054 


BE265848 


Hs.289080 


colon cancer-associated protein Mic1 


hMo-lo 


133579 


X75346 


Hs.75074 


mitogen-activated protein kinase-activat 


hMo-lo 


100528 


BE386801 


Hs.21858 


trinucleotide repeat containing 3 


hMo-lo 


107480 


AF001691 


Hs.74304 


periplakin 


hMo-lo 


133050 


X73424 


Hs.63788 


propionyl Coenzyme A carboxylase, beta p 


hi-lo-lo 


133061 


AI186431 


Hs.296638 


prostate differentiation factor 
prostate stem cell antigen 


hMo-lo 


106390 


AJ297436 


Hs.20166 


hi-lo-lo 


302124 


AA676403 


Hs.145078 


regulator of differentiation (in S. pomb 


hMo-lo 


129823 


X00949 


"Hs.105314 


relaxin 1 (H1) 


hMo-lo 


134444 


BE184455 


*Hs.251754 


secretory leukocyte protease inhibitor ( 


hMo-lo 


103240 


U81961 


Hs.2794 


sodium channel, nonvoltage-gated 1 alpha 


hMo-lo 


115761 


AA366037 


Hs.90911 


solute carrier family 16 (monocarboxylic 


hMo-lo 


321412 


AI674383 


Hs.22891 


solute carrier family 7 (cationic amino 
solute carrier family 7 (cationic amino 


hMo-lo 


126487 


AA283809 


Hs.184601 


hMo-to 


101759 


M80244 


Hs.184601 


solute carrier family 7 (cationic amino 


hMo-lo 


112941 


AW163034 


Hs.6467 


synaptogyrin 3 


hi-lo-to 


134351 


BE272506 


"Hs.82109 


syndecan 1 


hMo-to 


125924 


BE272506 


"Hs.82109 


syndecan 1 


hMo-lo 


130982 


AA033627 


Hs.21858 


trinucleotide repeat containing 3 


hMo-lo 


133473 


AW301993 


Hs.73980 


troponin T1, skeletal, slow 


hMo-to 


101042 


T46839 


■Hs.10319 


UDP glycosyltransferase 2 family, polype 


hMo-to 


129565 


X77777 


Hs.198726 


vasoactive intestinal peptide receptor 1 


hMo-to 


102992 


M85430 


-Hs.155191 


villin 2 (ezrin) 


hi-lo-lo 


106868 


BE185536 


Hs.300816 


Homo sapiens mRNA; cDNA DKFZp564l172 (fr 


io-hito 


132618 


AL050025 


"Hs.279916 


hypothetical protein FLJ20151 


lo-hi-hi 


100187 


D17793 


'Hs.78183 


aldo-keto reductase family 1, member C3 


lo-hi-hi 


116334 


AL038450 


Hs.48948 


ATP2C1 calcium transport ATPase, same as 


lo-hi-hi 


134454 


NM_013230 


Hs.266124 


CD24 antigen (small cell lung carcinoma 


lo-hi-hi 


302067 


BE542706 


Hs.222399 


CEGP1 protein 


to^Mii 


105500 


AW602166 


Hs.222399 


CEGP1 protein 


to-hMii 


100732 


AA557660 


■Hs.76152 


decorin 


lo-hi-hi 


129265 


AA530892 


Hs.171695 


dual specificity phosphatase 1 


lo-hMii 


117789 


N48294 


Hs.46850 


EST 


fo-hMii 


330786 


BE379594 


"Hs.49136 


ESTs, Moderately similar to ALU7_HUMAN A 


lo-hi-hi 


319808 


T58960 


Hs.17283 


hypothetical protein FLJ10890 


lo-hi-hi 


303502 


BE174240 




gb:QV1-HT0573-290200-092-f06 HT0573 Homo 


lo-hMii 


116780 


H22566 


"Hs.30098 


ESTs 


lo-hi-hi 


104189 


AB040927 


Hs.301804 


KIAA1494 protein 


lo-hi-hi 


105588 


L43821 


Hs.80261 


enhancer of filamentation 1 (cas-like do 


lo-hi-hi 


105731 


AA834664 


Hs.29131 


nuclear receptor coactivator 2 


to-hi-hi 


105772 


H57111 


Hs.221132 


ESTs 


to-hi-hi 


105794 


H24530 


Hs.273294 


hypothetical protein FLJ20069 


lo-hi-hi 


113098 


N77737 


Hs.8349 


Apobec-1 complementation factor; APOBEC- 


lo-hi-hi 


113803 


AW880709 


"Hs.283683 


chromosome 8 open reading frame 4 


lo-hi-hi 


114530 


AA601038 


Hs.191797 


ESTs 


to-hi-hi 


116188 


AA468183 


Hs.184598 


Homo sapiens cDNA: FLJ23241 fis, clone C 


lo-hi-hi 


117330 


AI904095 


Hs.43423 


ESTs 


lo-hi-hi 


117701 


BE063921 


Hs.295971 


ESTs 


lo-hi-hi 


120911 


AI189754 


Hs.144330 


ESTs 


lo-hi-hi 


124083 


AW195237 


Hs.7734 


hypothetical protein FLJ22174 


lo-hi-hi 


124690 


AW883529 


Hs.173830 


ESTs 


lo-hi-hi 


130796 


AA088809 


Hs.19525 


hypothetical protein FLJ22794 


lo-hi-hi 


131524 


AB040927 


Hs.301804 


KIAA1494 protein 


lo-hMii 


132116 


AW960474 


Hs.40289 


ESTs 


to-hi-hi 


132442 


AW970859 


Hs.313503 


ESTs 


lo-hi-hi 


310219 


AI221087 


Hs.147761 


ESTs 


lo-hi-hi 


310598 


AI439136 


Hs.140546 


ESTs 


lo-hMii 


310884 


AW014684 


Hs.232189 


ESTs 


to-hMii 


311587 


AI828254 


Hs.271019 


ESTs, Weakly similar to SMN_HUMAN SURVI 


lo-hi-hi 


312240 


R36475 


Hs.24321 


Homo sapiens cDNA FLJ12028 fis, clone HE 


lo-hi-hi 


312803 


AA677934 


Hs.117864 


ESTs 


lo-hMti 


314219 


AA262331 


Hs.48376 


Homo sapiens clone HB-2 mRNA sequence 


lo-hi-hi 


315052 


AA876910 


Hs.134427 


ESTs 


lo-hi-hi 


331919 


AA446869 


Hs.119316 


ESTs 


lo-hi-hi 


133240 


AK001489 


Hs.242894 


ADP-ribosylation factor-like 1 


lo-hi-hi 
134006 


Z45957 


Hs.7837 


124847 


W07701 


•Hs.304177 


129087 


AI348027 


Hs.108557 


131762 


AA744902 


"Hs.107767 


129000 


AA744902 


'Hs.107767 


105713 


AI122843 


"Hs. 18431 9 


118475 


N66845 




118381 


N64513 


Hs.48994 


105057 


AA134233 




131507 


AI826268 


Hs.27769 


124970 


BE272862 


Hs.106534 


130094 


NM_001471 


•Hs.167017 


302357 


X03178 


Hs.198246 


113231 


AA278583 


Hs.180737 


111923 


BE383234 


Hs.25925 


128530 


AI932995 


Hs.183475 


128987 


AI339046 


Hs.107637 


315368 


AB037745 


Hs.104696 


133944 


AW068579 


Hs.7780 


115084 


BE383668 


•Hs.42484 


132883 


AA373314 


Hs.5897 


109623 


AW207385 


Hs.295901 


130577 


M69241 


"Hs.162 


101889 


AF188747 


"Hs.181350 


130336 


AA535210 


"Hs.171995 


128180 


AW949068 


Hs.171995 


134921 


AL137491 


Hs.125511 


302385 


AJ224172 


Hs.204096 


117921 


AA021459 


Hs.306480 


101701 


NM_002436 


Hs.1861 


130356 


AF127577 


Hs.155017 


101763 


AB001914 


Hs.170414 


130342 


U81802 


Hs.154846 


130760 


AW379130 


Hs.16953 


101461 


N98569 


Hs.76422 


134032 


NM_005025 


Hs.78589 


303762 


AF034799 


Hs.30881 


110932 


AA021459 


Hs.306480 


135192 


U83993 


Hs.321709 


133886 


U97276 


Hs.77266 


134142 


BE244053 


Hs.79362 


100877 


X80821 


Hs.302177 


133534 


AU077115 


Hs.201675 


133011 


NM_006379 


Hs.171921 


132160 


W26406 


Hs.295923 


103110 


X62822 


Hs.2554 


130173 


U38847 


Hs.151518 


127435 


X69086 


■Hs.286161 


110520 


N54069 


Hs.4082 


114660 


AA071383 




330541 


NM_002038 


Hs.265827 


101486 


AA506324 


Hs.1852 


332386 


NM_000481 


Hs.102 


100569 


AA535210 


"Hs.171995 


134738 


AU076801 


Hs.89436 


103119 


X63629 


Hs.2877 


302692 


AW176909 


Hs.42346 


105402 


AB014680 


Hs.8786 


102976 


AU077174 


•HsJ288181 


101793 


W01076 


•Hs.119663 


129890 


AI868872 


"Hs^82804 


328164 






328648 






330032 






330033 






326816 






337603 






338561 






338562 






333743 






333845 






333849 






334221 






334222 






334578 






336662 






336684 






335289 






335290 






335293 
337182 







G-protein-coupled receptor induced prote lo-hi-hi 

Homo sapiens clone FLB8503 PRO2286 mRNA, lo-hi-hi 

Homo sapiens clone PP1057 unknown mRNA lo-hi-hi 

hypothetical protein PRO1489 lo-hi-hi 

hypothetical protein PRO1489 lo-hi-hi 

ESTs, Weakly similar to KIAA1006 protein lo-hi-hi 

gbza46c1 1 £1 Scares fetal Over spleen lo-hi-hi 

ESTs, Weakly similar to AF1 51 800 1 CG1-4 lo-hi-hi 

gb:zo20f10.s1 Stratagene colon (937204) lo-hi-hi 

ESTs, Weakly similar to MCAT_HUMAN MITOC lo-hi-hi 

hypothetical protein FLJ22625 lo-hi-hi 

gamma-aminobutyric acid (GABA) B recepto lo-hi-hi 

group-specific component (vitamin D bind lo-hi-hi 

Homo sapiens clone 23664 and 23905 mRNA lo-hi-hi 

Homo sapiens clone 23860 mRNA sequence lo-hi-hi 

Homo sapiens clone 25061 mRNA sequence lo-hi-hi 

hypothetical protein FLJ12806 lo-hi-hi 

KIAA1324 protein lo-hi-hi 

Homo sapiens mRNA; cDNA DKFZp564A072 (fr lo-hi-hi 

hypothetical protein FLJ10618 lo-hi-hi 

Homo sapiens mRNA; cDNA DKFZp586P1622 (f lo-hi-hi 

KIAA0493 protein lo-hi-hi 

insulin-like growth factor binding prote lo-hi-hi 

kallikrein 2, prostatic lo-hi-hi 

kallikrein 3, (prostate specific antigen lo-hi-hi 

kallikrein 3, (prostate specific antigen lo-hi-hi 

Homo sapiens mRNA; cDNA DKFZp434P1530 (f lo-hi-hi 

lipophilin B (uteroglobin family member) lo-hi-hi 

Homo sapiens mRNA; cDNA DKFZp761E2112 (f lo-hi-hi 

membrane protein, palmitoylated 1 (55kD) lo-hi-hi 

nuclear receptor interacting protein 1 lo-hi-hi 

paired basic amino acid cleaving system lo-hi-hi 

phosphatidylinositol 4-kinase, catalytic lo-hi-hi 

phosphodiesterase 9A lo-hi-hi 

phospholipase A2, group IIA (platelets, lo-hi-hi 

serine (or cysteine) proteinase inhibito lo-hi-hi 

protein tyrosine phosphatase, receptor t lo-hi-hi 

Homo sapiens mRNA; cDNA DKFZp761E2112 (f lo-hi-hi 

purinergic receptor P2X, ligand-gated io lo-hi-hi 

quiescin Q6 lo-hi-hi 

retinoblastoma-like 2 (p130) lo-hi-hi 

H.sapiens mRNA for ribosomal protein L18 lo-hi-hi 

RNA binding motif protein 5 lo-hi-hi 

sema domain, immunoglobulin domain (Ig), lo-hi-hi 

seven in absentia (Drosophila) homolog 1 lo-hi-hi 

sialyltransferase 1 (beta-galactoside al lo-hi-hi 

TAR (HIV) RNA-binding protein 1 lo-hi-hi 

Homo sapiens cDNA FLJ13613 fis, clone PL lo-hi-hi 

lectin, galactoside-binding, soluble, 8 lo-hi-hi 

gb:zm61d05.r1 Stratagene fibroblast (937 lo-hi-hi 

interferon, alpha-inducible protein (clo lo-hi-lo 

acid phosphatase, prostate lo-hi-lo 

aminomethyltransferase (glycine cleavage lo-hi-lo 

kallikrein 3, (prostate specific antigen lo-hi-lo 

cadherin 17, LI cadherin (liver-intestin lo-hi-lo 

cadherin 3, type 1, P-cadherin (placenta lo-hi-lo 

calcineurin-binding protein calsarcin-1 lo-hi-lo 

carbohydrate (chondroitin 6/keratan) sul lo-hi-lo 

cathepsin H lo-hi-lo 

CD59 antigen p18-20 (antigen identified lo-hi-lo 

Homo sapiens cDNA: FLJ22704 fis, clone H lo-hi-lo 

CH.06_hs gi|5868068 lo-hi-lo 

CR<ffJisgi|6Q04473 lo-hi-lo 

CH.16_p2 gi|6682596 lo-hi-lo 

CH.16_p2 gi|6682596 lo-hi-lo 

CH.20J>sgi|6£52458 lo-hi-lo 

CH22_C20H12.GENSCAN.16-2 lo-hi-lo 

CH22_EMAC005500.GENSCAN.421-5 lo-hi-lo 

CH22_EMAC005500.GENSCAN.421-6 lo-hi-lo 

CH22_FGENES^64_1 lo-hi-lo 

CH22_FGENES.290_3 lo-hi-lo 

CH22_FGENES.290_8 lo-hi-lo 

CH22_FGENES.360_1 lo-hi-lo 

CH22_FGENES.360_3 lo-hi-lo 

CH22_FGENES.406J lo-hi-lo 

CH22_FGENES.41-1 lo-hi-lo 

CH22_FGENES.46-1 lo-hi-lo 

CH22_FGENES.527_2 lo-hi-lo 

CH22_FGENES.527_3 lo-hi-lo 

CH22_FGENES.527_6 lo-hi-lo 

CH22_FGENES.570-2 lo-hi-lo 

CH22_FGENES.617_6 (same as BFH4) lo-hi-lo 
335810 






335824 






336054 






333124 






332340 


AP000692 


Hs.129781 


130380 


AI949359 


Hs.143600 


102962 


R50032 


Hs.159263 


331306 


AF102546 


Hs.63931 


319408 


AA448090 


Hs.87359 


312197 


T96203 




312405 


AI523875 




312939 


AA495930 


Hs.24444 


313475 


AA010200 


Hs.175551 


313624 


AA525775 


Hs.292523 


316897 


AA838114 


Hs.221612 


317850 


AJ681545 


Hs.152982 


318541 


T30290 


Hs.107515 


321325 


AB033100 


Hs.300646 


321696 


AA628791 


Hs.76228 


322189 


H65014 




322463 


AI242754 


Hs.137306 


322540 


R76593 




323131 


AK002088 


Hs.270124 


323243 


W47525 


Hs.110771 


323591 


AA301270 




323753 


AK002161 


Hs.70266 


323835 


AL042005 


Hs.1117 


323926 


AA354572 




324047 


AI433357 


'Hs.271340 


324330 


AA884766 




324753 


AA612626 


Hs.144871 


300702 


AA075481 


Hs.111334 


301712 


BE083080 


Hs.274323 


302380 


AA325633 


Hs.136102 


302970 


W05608 


Hs.312679 


303187 


AA115962 


Hs.323423 


303194 


AA082000 




305612 


AA782347 


Hs.272572 


304263 


AA062837 




304275 


AA070605 




304309 


AA112147 




305503 


AA759177 


Hs.298148 


308615 


AK000142 


Hs.101774 


309390 


AW080585 




104667 


AI239923 


Hs.30098 


310014 


D60745 


Hs.25925 


318814 


W07361 


Hs.22545 


321896 


C04863 


Hs.47191 


331661 


W52448 


Hs.56147 


332120 


AA609684 


Hs.112748 


332256 


AW975028 


Hs.102754 


107252 


D60745 


Hs.25925 


112068 


AI264847 


Hs.22545 


117929 


N51075 


Hs.47191 


119637 


W52448 


Hs.56147 


123712 


AA609684 


Hs.112748 


124560 


AW975028 


Hs.102754 


105039 


AA907305 


Hs.36475 


105271 


AA807881 


Hs.25329 


106689 


AW296584 


Hs.293782 


106849 


AL137281 


Hs.17110 


107071 


AW385224 


Hs.35198 


108218 


W57550 


Hs.301526 


110930 


BE242691 


Hs.14947 


112098 


R44714 


Hs.106795 


112170 


BE246743 


Hs.288529 


112902 


AL035633 


'Hs.129190 


114877 


AW024162 


Hs.205125 


116312 


BE379794 


Hs.65403 


116739 


H01463 


Hs.93534 


119267 


AA064970 


Hs.118145 


120570 


AA280679 


Hs.271445 


121176 


AL121523 


Hs.97774 


123360 


AA532718 


Hs.178604 


123974 


NM_015678 


Hs.3821 


124777 


R41933 




128046 


AA873285 




128666 


AA808466 


Hs.103395 


130639 


AI557212 


■Hs.17132 


130693 


R68537 


Hs.17962 


131756 


AA443966 


Hs.31595 


131985 


AA503020 


Hs.36563 



CH22_FGENES.617_7 lo-hi-lo 

CH22_FGENES.619_11 (same as BFH5) lo-hi-lo 

CH22_FGENES.683_3 lo-hi-lo 

CH22_FGENES.81_8 lo-hi-lo 

chromosome 21 open reading frame 5 lo-hi-lo 

type II Golgi membrane protein lo-hi-lo 

collagen, type VI, alpha 2 lo-hi-lo 

dachshund (Drosophila) homolog lo-hi-lo 

ESTs, Highly similar to RB18_MOUSE RAS-R lo-hi-lo 

gb:ye48b07.r1 Soares fetal liver spleen lo-hi-lo 

gb:tg97d04.x1 NCI_CGAP_CLL1 Homo sapiens lo-hi-lo 

Homo sapiens cDNA: FLJ22165 fis, clone H lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

hypothetical protein FLJ13117 lo-hi-lo 

ESTs lo-hi-lo 

KIAA protein (similar to mouse paladin) lo-hi-lo 

amplified in osteosarcoma lo-hi-lo 

gb:yu66f10.r1 Weizmann Olfactory Epithel lo-hi-lo 

ESTs lo-hi-lo 

gb:yi60c11.r1 Soares placenta Nb2HP Homo lo-hi-lo 

Homo sapiens cDNA FLJ11226 fis, clone PL lo-hi-lo 

Homo sapiens cDNA: FLJ21904 fis, clone H lo-hi-lo 

gb:EST14192 Testis tumor Homo sapiens cD lo-hi-lo 

yeast Sec31p homolog lo-hi-lo 

tripeptidyl peptidase II lo-hi-lo 

gb:EST62857 Jurkat T-cells V Homo sapien lo-hi-lo 

ESTs lo-hi-lo 

gb:am20a10.s1 Soares_NFL_T_GBC_S1 Homo s lo-hi-lo 

Homo sapiens cDNA FLJ13752 fis, clone PL lo-hi-lo 

ferritin, light polypeptide lo-hi-lo 

Homo sapiens, Similar to sialyltransfera lo-hi-lo 

KIAA0853 protein lo-hi-lo 

EST lo-hi-lo 

ESTs, Moderately similar to B Chain B, lo-hi-lo 

gb:sui26fD7.r1 Stratagene neuroepithelium lo-hi-lo 

hemoglobin, alpha 2 lo-hi-lo 

gb:zmG5b11.s1 Stratagene corneal stroma lo-hi-lo 

gb:zm53h09.s1 Stratagene fibroblast (937 lo-hi-lo 

gb:zm64c06.s1 Stratagene fibroblast (937 lo-hi-lo 

ESTs, Weakly similar to KIAA0565 protei lo-hi-lo 

hypothetical protein FLJ23045 lo-hi-lo 

gb:xc33f09.x1 NCI_CGAP_Co18 Homo sapiens lo-hi-lo 

ESTs lo-hi-lo 

Homo sapiens clone 23860 mRNA sequence lo-hi-lo 

Homo sapiens cDNA FLJ12935 fis, clone NT lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

Homo sapiens cDNA: FLJ21543 fis, clone C lo-hi-lo 

ESTs lo-hi-lo 

Homo sapiens clone 23860 mRNA sequence lo-hi-lo 

Homo sapiens cDNA FLJ12935 fis, clone NT lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

Homo sapiens cDNA: FLJ21543 fis, clone C lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

Homo sapiens mRNA; cDNA DKFZp434C2016 (f lo-hi-lo 

ectonucleotide pyrophosphatase/phosphodi lo-hi-lo 

hypothetical protein FLJ13181 lo-hi-lo 

ESTs, Weakly similar to ALU INHUMAN ALU S lo-hi-lo 

Homo sapiens cDNA FLJ13136 fis, clone NT lo-hi-lo 

hypothetical protein FLJ22635 lo-hi-lo 

Human DNA sequence from clone RP5-1046G1 lo-hi-lo 

ESTs lo-hi-lo 

hypothetical protein lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

ESTs, Weakly similar to ALU INHUMAN ALU lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

neurobeachin lo-hi-lo 

gb:yg04ffl9.s1 Soares infant brain 1NIB H lo-hi-lo 

gb:oh68h05.s1 NCI_CGAP_Kid5 Homo sapiens lo-hi-lo 

hypothetical protein FLJ14146 lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

ESTs lo-hi-lo 

hypothetical protein FLJ22418 lo-hi-lo 
132932 AW118826 Hs.6093 Homo sapiens cDNA: FLJ22783 fis, clone K lo-hi-lo 

134696 BE326276 'Hs.8861 ESTs lo-hi-lo 

300967 AA565209 Hs.269439 ESTs lo-hi-lo 

301182 AW291411 Hs.192531 ESTs, Weakly similar to S00754 zinc fing lo-hi-lo 

302595 AI699372 Hs.193247 Homo sapiens mRNA; cDNA DKFZp434A171 (fr lo-hi-lo 

303132 AI929819 Hs.4055 chromosome 21 open reading frame 50 lo-hi-lo 

303506 AA340605 Hs.105887 ESTs, Weakly similar to Homolog of rat Z lo-hi-lo 

303654 BE246743 Hs.288529 hypothetical protein FLJ22635 lo-hi-lo 

310026 AA278233 Hs.100691 ESTs lo-hi-lo 

310056 AI253072 Hs.145383 ESTs lo-hi-lo 

310353 AI261700 Hs.145544 ESTs lo-hi-lo 

310371 AI262584 Hs.145575 ESTs lo-hi-lo 

310430 AI670843 Hs.200257 ESTs lo-hi-lo 

310438 AW022192 Hs^00197 ESTs lo-hi-lo 

310455 AI277603 Hs.145990 ESTs lo-hi-lo 

310787 AW262580 Hs.147674 KIAA1621 protein lo-hi-lo 

311067 AI587332 Hs.209115 ESTs lo-hi-lo 

311422 F00677 Hs.101316 ESTs lo-hi-lo 

311465 AI758660 Hs.206132 ESTs lo-hi-lo 

312073 AA682393 "Hs.119237 ESTs lo-hi-lo 

312105 T81819 Hs.302251 ESTs lo-hi-lo 

312108 T82331 Hs.127453 ESTs lo-hi-lo 

312292 AW450103 Hs.151124 ESTs lo-hi-lo 

312313 AW293341 Hs.122505 ESTs, Weakly similar to I38022 hypotheti lo-hi-lo 

312600 AW970985 Hs.290853 ESTs lo-hi-lo 

312800 AI248774 Hs.126707 hypothetical protein FLJ11457 lo-hi-lo 

312821 AA699325 Hs.269880 ESTs lo-hi-lo 

313097 AI676164 Hs.204339 ESTs lo-hi-lo 

313166 AI801098 Hs.151500 ESTs lo-hi-lo 

313179 AA927670 Hs.131704 ESTs lo-hi-lo 

313280 AW960454 Hs.222830 ESTs lo-hi-lo 

313689 AI608810 Hs.193288 ESTs lo-hi-lo 

314146 AI827237 Hs.282884 ESTs lo-hi-lo 

314305 AI280112 Hs.125232 Homo sapiens cDNA FLJ13266 fis, clone OV lo-hi-lo 

314456 AI867931 Hs.164595 ESTs lo-hi-lo 

314465 AA602917 Hs.156974 ESTs lo-hi-lo 

314881 AI095087 Hs.152299 ESTs, Moderately similar to ALU5_HUMAN A lo-hi-lo 

314916 AA548906 Hs.122244 ESTs lo-hi-lo 

315043 AA806538 Hs.130732 KIAA1575 protein lo-hi-lo 

315074 AA828284 Hs.136729 Homo sapiens cDNA: FLJ21348 fis, clone C lo-hi-lo 

315214 AI915927 Hs.34771 ESTs lo-hi-lo 

315344 AW292176 Hs.245834 ESTs lo-hi-lo 

315353 AI373949 Hs.279610 hypothetical protein FLJ10493 lo-hi-lo 

315439 T78413 Hs.293696 ESTs lo-hi-lo 

315528 R37257 Hs.184780 ESTs lo-hi-lo 

315720 AA292998 Hs.163900 ESTs lo-hi-lo 

315772 AW515373 Hs.271249 Homo sapiens cDNA FLJ13580 fis, clone PL lo-hi-lo 

315841 AW136397 Hs.247572 ESTs lo-hi-lo 

316042 AI469960 Hs.170698 ESTs lo-hi-lo 

316244 AI640761 Hs.224988 ESTs lo-hi-lo 

316345 AW139408 Hs.152940 ESTs lo-hi-lo 

316625 BE540090 Hs.122156 ESTs lo-hi-lo 

316738 AA889055 Hs.123468 ESTs lo-hi-lo 

316868 AI660898 Hs.195602 ESTs lo-hi-lo 

316905 AW138241 Hs.210846 ESTs lo-hi-lo 

317224 X73608 "Hs.93029 sparc/osteonectin, cwcv and kazal-like d lo-hi-lo 

317275 AI809444 Hs.202108 ESTs lo-hi-lo 

317404 AI806867 Hs.126594 ESTs lo-hi-lo 

317488 AW071851 Hs.130628 ESTs lo-hi-lo 

317916 AJ565071 Hs.159983 ESTs lo-hi-lo 

317939 AI986208 Hs.244760 ESTs lo-hi-lo 

318486 T23514 gb:seq3329 1NIB Homo sapiens cDNA clone lo-hi-lo 

319897 N46574 Hs.43838 ESTs lo-hi-lo 

320654 AI160015 Hs.118112 ESTs lo-hi-lo 

320697 N62937 Hs.269109 ESTs lo-hi-lo 

320787 AW088363 Hs.246240 ESTs lo-hi-lo 

321023 AW294316 Hs.125608 ESTs lo-hi-lo 

321899 AW972832 Hs.29468 ESTs lo-hi-lo 

322939 AA101697 Hs.211270 ESTs lo-hi-lo 

323045 AA148950 Hs.188836 ESTs lo-hi-lo 

323091 AI902456 Hs.210761 ESTs lo-hi-lo 

323262 AL133990 Hs.190642 ESTs lo-hi-lo 

323410 AW118683 Hs.154150 ESTs lo-hi-lo 

323645 AW445014 Hs.197746 ESTs lo-hi-lo 

324598 AW972227 Hs.163986 Homo sapiens cDNA: FLJ22765 fis, clone K lo-hi-lo 

324666 T78413 Hs.293696 ESTs lo-hi-lo 

324674 AA541323 Hs.115831 ESTs lo-hi-lo 

324713 AI093930 "Hs.313466 ESTs lo-hi-lo 

324790 AI334367 Hs.159337 ESTs lo-hi-lo 

324804 AI692552 gb:wo73f12.x1 NCI_CGAP_Lu24 Homo sapiens lo-hi-lo 

330728 AI905520 Hs.29672 ESTs lo-hi-lo 

330760 H04588 Hs.30469 ESTs lo-hi-lo 
330776 


AW953605 


Hs.21887 


330824 


AB037732 


Hs.61441 


331028 


AI539652 


Hs.28338 


331046 


N66563 


Hs.191358 


331050 


BE007967 


Hs.155795 


331053 


AI949841 


Hs.183146 


331180 


R44692 


Hs.6640 


331313 


M761094 


'Hs.80618 


331337 


N74392 


Hs.50495 


331393 


AW976438 


'Hs.17428 


331432 


AA262451 


Hs.38485 


331517 


AA765603 


Hs.180877 


331686 


AW474960 


Hs.182256 


332002 


AI579909 


Hs.105104 


332043 


AA371307 


Hs.125056 


332265 


AW770320 


Hs.222413 


332314 


R41396 


Hs.101774 


131517 


AB037789 


Hs.263395 


315352 


AA604799 


Hs.136528 


315498 


AA628539 


Hs.116252 


321489 


AJ459177 


Hs.172759 


106099 


NM_012068 


Hs.9754 


105726 


NM_012068 


Hs.9754 


319926 


AI820719 


Hs.154662 


314915 


AI673735 


Hs.187748 


315198 


AI741506 


Hs.186753 


324302 


AW972771 


Hs.292471 


331341 


BE541042 


"Hs.23240 


113783 


AL359588 


Hs.7041 


313552 


AI889208 


Hs.17283 


103989 


AA315993 


Hs.105484 


331492 


AK001114 


Hs.53913 


110837 


H03109 


Hs.108920 


330814 


AI955040 


Hs.265398 


312226 


AA315703 


Hs.199993 


102034 


AJ903474 


Hs.230 


134671 


BE263255 


Hs.302749 


131083 


Y09763 


Hs.22785 


309575 


AW168096 


Hs.169476 


134332 


D86962 


Hs.81875 


132904 


NM_005518 


Hs.59889 


302910 


M77976 


Hs.251577 


133731 


N71725 


"Hs.272572 


303297 


AF070623 


Hs.13423 


108732 


AA258888 


Hs.107476 


108731 


AA258888 


Hs.107476 


302123 


AB013452 


Hs.144931 


131614 


AB002438 


Hs.29596 


104933 


N94126 


Hs.12969 


302235 


AL049987 


Hs.166361 


320574 


AL049443 


Hs.161283 


324678 


AI990739 


Hs.77868 


331022 


H03109 


Hs.108920 


332430 


H25350 


Hs.21145 


330601 


U90916 


Hs.82845 


101988 


AF221521 


Hs.8068 


102859 


AL036058 


"Hs.76807 


101363 


M11321 




133968 


AA355986 


Hs.232068 


332530 


M31669 


Hs.1735 


317777 


NM_014785 


Hs.47313 


100452 


D87742 


Hs.241552 


112988 


NM_014867 


Hs.5333 


320848 


AB020691 


Hs.198232 


105162 


AL133033 


•Hs.4084 


133905 


AB028974 


Hs.137476 


331406 


BE176893 


Hs.23440 


321441 


AF107493 


Hs.118498 


131913 


AW207440 


Hs.185973 


135424 


U67611 




128506 


L40904 


Hs.100724 


330506 


AI130740 


Hs.6241 


311251 


AI655662 


Hs.197698 


314171 


AI821895 


Hs.193481 


106096 


AW379378 


Hs.170121 


133740 


AW162919 


"Hs.170160 


119521 


W38038 




119546 


W38169 




119559 


W38197 




133797 


AL133921 


Hs.76272 


305096 


AA642964 


Hs.163593 


120256 


M169801 


Hs.98710 



ESTs 

KIAA1311 protein 
KIAA1546 protein 
ESTs 
ESTs 

ESTs, Moderately similar to ALU1_HUMAN A 
Human DNA sequence from PAC 75N13 on chr 
hypothetical protein 
ESTs 

RBP1-like protein 
ESTs 

H3 histone, family 3B (H3.3B) 

ESTs 

ESTs 

ESTs 

ESTs 

hypothetical protein FLJ23045 

sema domain, transmembrane domain (TM), 

ESTs, Moderately similar to ALUIJWMAN A 

ESTs, Moderately similar to ALU INHUMAN A 

ESTs, Moderately similar to ALU7_HUMAN A 

activating transcription factor 5 

activating transcription factor 5 

DnaJ (Hsp40) homolog, subfamily A, membe 

ESTs, Weakly similar to ALU1_HUMAN ALU S 

ESTs, Weakly similar to ALU1_HUMAN ALU S 

ESTs, Weakly similar to ALU1_HUMAN ALU S 

Homo sapiens cDNA FLJ13496 fis, clone PL 

hypothetical protein DKFZp762B226 

hypothetical protein FLJ10890 

Homo sapiens regenerating gene type IV m 

hypothetical protein FLJ10252 

HT018 protein 

ESTs, Weakly similar to transformation-r 
ESTs 

fibromodulin 

FK506-binding protein 9 (63 kD) 
gamma-aminobutyric acid (GABA) A recepto 
glyceraldehyde-3-phosphate dehydrogenase 
growth factor receptor-bound protein 10 
3-hydroxy-3-methylglutaryl-Coenzyme A sy 
hemoglobin, alpha 1 
hemoglobin, alpha 2 

Homo sapiens clone 24468 mRNA sequence 
ATP synthase, H+ transporting, mitochond 
ATP synthase, H+ transporting, mitochond 
ATPase, aminophospholipid transporter (A 
Homo sapiens mRNA from chromosome 5q21-2 
hypothetical protein 

Homo sapiens mRNA; cDNA DKFZp564F112 (fr 
Homo sapiens mRNA; cDNA DKFZp586N2020 (f 
ORF 

HT018 protein 

hypothetical protein FLJ22489 
Homo sapiens cDNA: FLJ21930 fis, clone H 
hematopoietic PBX-interacting protein 
major histocompatibility complex, class 

transcription factor 8 (represses interl 
inhibin, beta B (activin AB beta polypep 
KIAA0256 gene product 
KIAA0268 protein 
KIAA0711 gene product 
KIAA0884 protein 
KIAA1025 protein 
KIAA1051 protein 
KIAA1105 protein 

Homo sapiens LUCA-15 protein mRNA, splic 
degenerative spermatocyte (homolog Droso 
transaldolase 1 

peroxisome proliferative activated recep 
phosphoinositide-3-kinase, regulatory su 
ESTs 
ESTs 

protein tyrosine phosphatase, receptor t 
RAB2, member RAS oncogene family-like 



retinoblastoma-binding protein 2 
ribosomal protein L18a 
hypothetical protein 



to-hi-lo 
lo-hi-to 
to-hi-lo 
to-hi-lo 
to-hi-to 
to-hMo 
lo-hi-to 
Io4n-to 
to-hi-to 
to-hMo 
to-hi-lo 
to-hMo 
lo-hi-to 
lo-hi-to 
lo-hi-to 
to-hi-lo 
to-hMo 
lo-hi-to 
to-hi-lo 
to-hMo 
to-hMo 
lo-hi-to 
lo-hi-to 
lo-hMo 
lo-hi-to 
to-hMo 
lo-hi-to 
to-hi-lo 
to-hi-lo 
lo-hi-to 
lo-hi-to 
lott-to 
lo-hi-lo 
to-hMo 
to-hi-to 
lo-hi-lo 
lo-hi-to 
to-W-lo 
lo-hi-lo 
to-hi-lo 
lo-hi-to 
lo-hi-to 
to-hi-lo 
to-hi-lo 
lo-hMo 
lo-hi-lo 
to-hi-Jo 
lo-hi-to 
lo-hi-to 
to-hi-to 
to-hi-to 
lo-hi-to 
to-hMo 
lo-hi-to 
lo-hi-to 
lo-hi-to 
lo-hi-to 
to-hi-to 
to-hMo 
kMiMo 
to-hi-lo 
to-hi-lo 
to-hi-to 
to-hi-to 
to-hi-to 
to-hi-Jo 
to-hi-to 
lo-hi-to 
to-hMo 
to-hi-to 
to-hi-to 
to-hi-to 
lo-hi-lo 
to-hi-to 
to-hi-to 
to-hMo 
to-hi-to 
to-hi-to 
lo-hi-to 
to-hi-to 
to-hi-to 
lo-hi-lo 
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322919 


AA178955 


Hs.271439 


ESTs 


to-hi-to 


300566 


R34926 


Hs.326392 


son of sevenless (Drosophila) homolog 1 


lo-hi-lo 


330694 


AI741617 


Hs.108447 


spinocerebellar ataxia 7 (olivopontocere 


lo-hi-lo 


302416 


AL120259 


Hs.76691 


stannin 


to-hi-to 


319289 


AA037534 


Hs.79059 


transforming growth factor, beta recepto 


to-hi-to 


134656 


AI750878 


Hs.87409 


thrombospondin 1 


lo-hi-lo 


130117 


U06641 


Hs.150207 


UDP glycosyltransferase 2 family, polype 


lo-hi-lo 


124357 


N22401 




gb:yw37g07.s1 Morton Fetal Cochlea Homo 


to-hi-lo 


108293 


AA069155 




gb:zm10f11.s1 Stratagene pancreas (93720 


lo-hi-lo 


108657 


BE567753 


Hs.132955 


BCL2/adenovirus E1B 19kD-interacting pro 


lo-hi-lo 


108658 


AA641695 




gb:nr62h10.s1 NCI_CGAP_Lym3 Homo sapiens 


lo-hi-lo 


331278 


AA071383 




gb:zm61d05.r1 Stratagene fibroblast (937 


to-hi-lo 


108340 


M069820 


Hs.180909 


peroxiredoxin 1 


to-hWo 


108679 


AA115963 


Hs.323423 


ESTs, Moderately similar to B Chain B, 


to-hi-to 


108406 


M075424 


Hs.325505 


ESTs, Moderately similar to HBA_HUMAN HE 


lo-hi-to 


114598 


AA075601 




gb:zm88c05.r1 Stratagene ovarian cancer 


to-hi-to 


108462 


AA079347 




gb:zm96c06.s1 Stratagene colon HT29 (937 


lo-hi-lo 


108466 


AA079409 




gb:zm96h02^1 Stratagene colon HT29 (937 


io-hHo 


108489 


AA082977 




gb:zn07h10.r1 Stratagene hNT neuron (937 


lo-hi-lo 


330859 


AA082977 




gb:zn07h10.r1 Stratagene hNT neuron (937 


lo-hi-lo 


108505 


AA083376 




gb:zn09g08^1 Stratagene hNT neuron (937 


lo-hi-to 


331283 


AA467736 


Hs.275437 


ESTs 


to-hi-to 


100641 


AW068302 


"Hs.182183 


Homo sapiens mRNA for caldesmon, 3' UTR 


lo-hi-lo-hi 


100642 


AW068302 


•Hs.182183 


Homo sapiens mRNA for caldesmon, 3' UTR 


lo-hi-lo-hi 


325889 






CR16Jis gi|5867087 


to-hWo-hi 


338038 






CH22_EMAC005500.GENSCAN.149-9 


lo-hi-lc-hi 


338316 






CH22_E^tAC005500.GENSCAN.304-2 


lo-hi-to-hi 


100999 


H38765 


Hs.80706 


diaphorase (NADH/NADPH) (cytochrome b-5 


to-hMo-hi 


331131 


R54797 




gb:yg87b07.s1 Soares infant brain 1NIB H 


lo-hi-lo-hi 


310955 


AI476732 




ESTs 


lo-hi-lo-hi 


311137 


AW207582 


Hs.196042 


ESTs 


lo-hi-io-hi 


311598 


AW023595 


Hs.232048 


ESTs 


lo-hWo-hi 


313070 


A1422023 


Hs.161338 


ESTs 


to-hWo-hi 


110844 


Ar740792 


Hs.167531 


methylcrotonoyi-Coenzyme A carboxylase 2 


to-hi-to-hi 


120328 


AA923278 


Hs.290905 


ESTs, Weakly similar to protease [Rsapi 


lo-hi-lo-hi 


105914 


AW245680 


Hs.9701 


growth arrest and DNA-damage-inducible, 


to-hi-lo-hi 


129389 


NMJM2445 


•Hs.288126 


spondin 2, extracellular matrix protein 


to-hMo-hi 


102759 NM_005100 Hs.788 A kinase (PRKA) anchor protein (gravin) lo-lo-hi

100168 H73444 Hs.394 adrenomedullin lo-lo-hi

102348 U37519 Hs.87539 aldehyde dehydrogenase 8 lo-lo-hi

134158 U15174 Hs.79428 BCL2/adenovirus E1B 19kD-interacting pro lo-lo-hi

133908 AU076820 Hs.325474 caldesmon 1 lo-lo-hi

101883 AU076743 Hs.75613 CD36 antigen (collagen type I receptor, lo-lo-hi

327821 CH05_hs gi|5867968 lo-lo-hi

134133 AA262294 Hs.180383 dual specificity phosphatase 6 lo-lo-hi

103000 NM_001975 Hs.146580 enolase 2, (gamma, neuronal) lo-lo-hi

109251 AA194776 Hs.85935 EST lo-lo-hi

315566 AB037810 Hs.18760 KIAA1389 protein lo-lo-hi

324697 AK000742 Hs.126774 L2DTL protein lo-lo-hi

306011 AA896986 gb:al06a08^1 Barstead spleen HPLRB2 Hom lo-lo-hi

307111 AI174528 gb:an45g10.s1 Gessler Wilms tumor Homo s lo-lo-hi

106639 AV655272 Hs.20252 novel Ras family protein lo-lo-hi

106753 AI656166 Hs.7331 hypothetical protein FLJ22316 lo-lo-hi

107974 AW956103 Hs.61712 pyruvate dehydrogenase kinase, isoenzyme lo-lo-hi

112033 R49031 Hs.22627 ESTs lo-lo-hi

113816 H46008 Hs.31518 ESTs lo-lo-hi

116024 AA088767 Hs.83883 transmembrane, prostate androgen induced lo-lo-hi

116158 AA381807 Hs.61762 hypoxia-inducible protein 2 lo-lo-hi

119071 R31180 gb:yh62b02^1 Soares placenta Nb2HP Homo lo-lo-hi

120132 W57554 Hs.125019 ESTs lo-lo-hi

120655 AA305599 Hs.238205 hypothetical protein PRO2013 lo-lo-hi

122411 AW172356 Hs.99083 ESTs lo-lo-hi

320779 AA815354 Hs.169898 ESTs lo-lo-hi

321024 AW246216 Hs.32058 Homo sapiens C1orf19 mRNA, partial cds lo-lo-hi

321408 AW081530 Hs.137088 ESTs, Weakly similar to ALU1_HUMAN ALU S lo-lo-hi

323620 AA306997 Hs.268362 ESTs, Weakly similar to hypothetical pro lo-lo-hi

314946 AJ097229 Hs.217484 ESTs lo-lo-hi

320683 AA334511 Hs.26638 ESTs, Weakly similar to unnamed protein lo-lo-hi

128959 AI580127 Hs.107381 hypothetical protein FLJ11200 lo-lo-hi

128896 T53925 Hs.107 fibrinogen-like 1 lo-lo-hi

133592 AV652066 Hs.75113 general transcription factor IIIA lo-lo-hi

103245 BE566343 Hs.28988 glutaredoxin (thioltransferase) lo-lo-hi

314785 AI538226 Hs.32976 guanine nucleotide binding protein 4 lo-lo-hi

103677 Z83806 gb:H.sapiens mRNA for axonemal dynein he lo-lo-hi

131170 NM_014253 Hs.23796 odz (odd Oz/ten-m, Drosophila) homolog 1 lo-lo-hi

131164 AW013807 Hs.182265 keratin 19 lo-lo-hi

100409 D86957 Hs.80712 KIAA0202 protein lo-lo-hi

133167 AW162840 Hs.6641 kinesin family member 5C lo-lo-hi

319080 AW967646 Hs.23023 ESTs lo-lo-hi

330706 AF097994 Hs.301528 L-kynurenine/alpha-aminoadipate aminotra lo-lo-hi

104052 NM_002407 Hs.97644 mammaglobin 2 lo-lo-hi

100547 M57417 gb:Homo sapiens mucin (mucin) mRNA, part lo-lo-hi
103145 X66276 Hs.169849 myosin-binding protein C, slow-type lo-lo-hi

301015 AV655272 Hs.20252 novel Ras family protein lo-lo-hi

311013 AA224760 Hs.153 ribosomal protein L7 lo-lo-hi

132050 AI267615 Hs.38022 ESTs lo-lo-hi

132349 AW975654 Hs.181286 serine protease inhibitor, Kazal type 1 lo-lo-hi

130889 AW972512 Hs.20985 sin3-associated polypeptide, 30kD lo-lo-hi

130791 AF030403 Hs.199263 Ste-20 related kinase lo-lo-hi

130385 AW067800 Hs.155223 stanniocalcin 2 lo-lo-hi

127229 AA316181 Hs.61635 six transmembrane epithelial antigen of lo-lo-hi

133820 S69681 Hs.177582 surfactant pulmonary-associated protein lo-lo-hi

129523 M13231 Hs.274509 T cell receptor gamma constant 2 lo-lo-hi

321415 BE621807 Hs.3337 transmembrane 4 superfamily member 1 lo-lo-hi

131859 AW960564 Hs.3337 transmembrane 4 superfamily member 1 lo-lo-hi

133444 M63978 Hs.73793 vascular endothelial growth factor lo-lo-hi

332567 AW939251 Hs.25647 v-fos FBJ murine osteosarcoma viral onco lo-lo-hi

131328 AW939251 Hs.25647 v-fos FBJ murine osteosarcoma viral onco lo-lo-hi

315901 AI521558 Hs.7331 hypothetical protein FLJ22316 lo-lo-hi

104394 AA129551 Hs.172129 Homo sapiens cDNA: FLJ21409 fis, clone C lo-lo-hi

103739 AA115173 gb:zn30d02.s1 Stratagene neuroepithelium lo-lo-hi

103797 AA080912 gb:zn04d03.r1 Stratagene hNT neuron (937 lo-lo-hi

103804 AA129196 gb:zn29d08.r1 Stratagene neuroepithelium lo-lo-hi
TABLE 1B

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
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Pkey 
108462 
108489 
101216 



131328 



124148 



124153 



CAT Number Accessions 

116651_1 AA079347 AA079506 AA079538 AA079442
118662_1 AA082977 AA082955 AA082956

17379_1 AA284166 AA314707 L25876 L27711 AA092745 N92087 U02681 AA315766 BE385121 AA352693 NM_005192 AI739135 AI066521 AW173105

AA257103 AA450169 AW261971 AA305065 A1954494 AW950384 AW732122 AA830348 AA789097 AA777794 AA284072 BE564465 A1005313 
AA804528 AI041134AI700317 AI352491 AA856987 AA769007 AA494334 AA769862 AA831 168 AI1 43496 BE090796 AA831166AI141222 
AI372907 N64843 AI075136 A!076701 AA464156 A1076409 AI273523 AA627383 BE043332 T96666 AA1 58102 AA1 58059 AW340182 
AA257019 AI206700 AI678081 AA757304 AA055005 AW059834 AL039012 

8509_1 AW939251 NM_005252 AU076596 V01512 V01512 AW579056 AA249247 AI590359 AW510478 AW518282 BE046054 AW874080 AI268596

AA996237 A1695592 AI2441 17 AA290764 AM01 957 AA505878 AA428304 W74018 W74016 AA040944 AI272071 AA745909 AA620979 
AA019816 A1245094 AW009706 AA662536 AW024264 AI268601 AA932024 AW513222 AW024169 AI659705 AA932526 AA975329 AI567603 
AI889320 AA514238 AA020837 AI623966 AA843677 AA477453 AA496353 AW372625 AV656426 K00650 W96348 N62388 R95977 AA434270 
AI093633 T27639 AW960245 AW881177 R15253 N36936 F07701 AA319315AA337290 AA284642 AA344052 F05184AA351062 AA378451 
AW794233 AW884380 N36951 R49879 AB022276 AA300350 AW839435 AW191708 BE220350 AA280404 AA485546 AW794235 AV654223 
AW838891 AA295986 M72823 AA335648 AA371089 AW845414 H63166 R12840 AA379680 AA477579 R13148 H71003 H71015 AA362156 
AW750674 AW845415 AA366924 AW608044 AJ570388 R3151 1 R33906 R33921 AW663022 AW360985 AI207838 AW607239 AI672451 
AI573282 AW794752 AA370328 AW998896 AW797239 AW998912 AW794742 AI954543 AI810067 AW073373 AA370325 AW195330 C18106 
AW998736 R79476 AA429721 AI891081 AI381 534 AW0221 37 AW020000AI 630329 N99428 AI870222 Al 971 257 AI9221 96 Ai857753 
AW579397 D56749 AI925005 AI685727 AW805573 AI982678 AI784604 A1005625 AW877772 AI634947 AI950829 AA493243 BE166086 
AI801820 AI925643 AI627992 AW316704 AJ261318 D57757 AA887178 AW770406 AI972075 AI222254 AI675794 D58060 AI701954 D58166 
AI799500 AW805669 AW276098 AW874253 A1962991 AJ248184 AW996924 AIG17462 AW022260 AI885957 BE176841 AA878863 AI697419 
AW662094 A1479529 BE 177025 D57403 AA507952 AW664593 AW800998 A1985773 AA566089 AA442759 AI624670 AI460284 A1800205 
A1537788 A) 537593 AI244382 AA583463 AA922678 AA864382 AI610837 D58070 AA844283 AA947992 N73801 AI453821 058184 AI678887 
AW243755 AA746085 D57742 AA757380 R44148 AA496403 BE 1 80303 AW363528 BE006616 D57395 AW805507 AW805511 AA617991 
AJ373585 H30122 057744 AW805501 D57691 D58148 AW873164 AW768483 D57601 AA77781 2 AA837997 BE180123 D57599 AA485387 
AW022208 D58096 N67917 W95944 AW805506 D57518 D57990 AI074096 D56521 D58151 AA428720 D56648 D57778 AW805504 D57750 
D58108 AW021706 D57449 D57041 D5B277 056935 A1356974 057023 AA018712 H27631 D57851 057514 D57268 D57468 AW805646 
AI278945 D57323 D56986 D57539 D57829 D58078 AW805515 AI348684 D57772 R74449 BE041558 D56746 AW798485 D56640 AA985597 
D56702 D56849 D56874 AW581419 AA470397 057591 AW798984 T27640 N66497 D56803 AA618186 AW805647 D57945 N23726 056637 
N23730 D56992 BE176882 BE176839 BE176909 056757 N68137 D56987 AI559806 AA631437 057464 056718 C17030 T29278 D57377 
AW021936 AW1 1 8330 AA51 5358 056610 AA494092 D56934 T97774 A1473546 R74350 R84834 AA579200 D56616 C03207 D57391 N52416 
D56928 R79209 D56925 AA020879 045546 AIB58769 R20750 T09381 F01435 AW627906 058202 AI933993 F01912 H27552 AA174191 
T16515AW023216AA434146 H83387AI346751 V01512V01512AA576407 AW365140 AA937471 BE174681 AI568829 A1274663 R85530 
AL048225 H83388 AW798734 

31218_1 BE300094 BE384439 AW794648 NM_002305 M57678 AI929016 AU076727 Z83844 Z83844 AI906100 W44519 H98497 AA188089 AA572687

AA035793 W93978 BE409220 AA359751 AA502475 H28319 AA527889 AA432335 AA864762 AA340061 C051 80 W68192 AA32781 1 
AA345871 AI750205 N34093 N86639 AA085753 AA603415 AJ355561 AA442262 N42135 C04367 N57266 AI038364 AI184846 AI928853 
X15256 J04456 AA603552 AA317300 AA588615 AA813495 N40276 AA400624 AW264898 H21418 AA643822 AA603569 AA507955 N44497 
AI000869 AW079049 AA614629 AA303987 AA362817 H54502 N85495 W52256 F30575 AA568129 H26935 W93977 AA373651 AA872398 
AI332540 AW572787 F20782 AA442263 AW301076 AA558556 AA825366 W23842 A1038829 AA302408 AA374629 AA614477 AA341686 
AA374846AA187091 F24764 AA1 57099 AA374853 AA991 592 F26839 AA744090 AA936881 AA374627 AA329755 AA854398 AA618108 
AA973600 AA757956 W44520 AA379779 AA373698 AA369135 AA380039 BE408327 AA3751 17 AA375744 AA380014 AA373556 AI335987 
AA903267 AA828223 F25088 AI246573 AA299386 BE275844 BE275666 BE384214 BE620707 AA975886 AA858048 BE548468 AA193055 
BE274324 AI870164 AA1 29614 AA922761 AA935745 AA374567 A1580916 AA374661 AW239224 AA374466 N52172 F24306 AA300453 
AA363443 AA588627 F19159 AA580021 N90877 AA654335 AA679168 AA573071 AW238834 AA988739 AW239423AA976330 AI074239 
AA99991 1 AJ200930 AI9711 73 Al 187321 AA937760 AI016242 AA373684 AI094874 AI302174 AA641237 AI370974 AI971010 AA400379 
AA679137 A1096579 AI001918 AA5241 01 X14829 AA081302 N30374 AI338782 W74444 AA528232 AI734954 AW188024 AA433857 W92348 
W94431 AI708356 AI753458 AA494460 AA825257 AA614246 AA039477 AJ350213 AI3091 1 0 AA745965 AA291 936 AW001 376 AI066764 
W74407 F30627 AA291937 AA480615 AA931667 AA331315 AI936154 AA824332 AA181109 A1017291 AA934736 AA062637 AA599977 
H54814 AA635624 AI802655 AA564078 R69997 AA716551 F30469 AA96 1 030 Al 126757 A1183943 AI066798 AI419436 AA302095 AA157768 
M95303O AA588476 AA131216T79619 AI752885 AA614820 AA988962 AI143561 AA493182 AI302481 AA301613 R73520 AA069898 
AA374944 AW364221 AA342013 AI244949 F36390 AW050980 N79486 AA1 01160 T68112 AI750204 AA328787 H02617 AA314734 AA527923 
AA307835 AI8851 12 Al 872905 AA534666 AA188363 AM 92490 H45772 AI824700 AI1 84276 AW079473 N29847 AA720843 AA720914 
AA573391 H54416 T59424 AI824457 AA304220 AA482553 W72882 AA627932 H27514 H28400 W68050 H20953 AA635786 H21 376 
AA514046 AI342823 F29905 H25999 AA757144 H21636 F22104 AA428650 F27143 F28346 AA535690 H45771 AA548851 AW170154 H45646 
W92274 AI921614 AA176461 AW170153 AI927284 AI161206 AA594439 T28595 H41 129 A1497579 AA978015 AA328875 AA373653 AA090973 
AA328623 AA328759 AA366468 AA375406 H46976 R86050 H02722 AA328321 AA328205 R62358 AA373717 AA304138 AA304224 AA301603 
H54867 AA374783 AA376232 AA373239 AA374917 AA375673 AA303857 AA376466 AA376461 AA302613 AA304082 AA301731 AA357988 
AA303328 R25744 AA301 587 N78746 H20508 AA659423 R47960 AA825456 AJ001 806 AI2451 1 4 AA729223 AA860271 AI91 3845 H26296 
AA733035 AA340965 AA304291 H27356 H20598 AA129613 R69996 AA157689 H20992 W15630 W16551 H25964 H21754 W01159 W42885 
AA176730 H39504 N39788 AA182956 H27565 AA082164 AA328927 AA339934 H61805 H61804 H45580 AA476229 AA714104 AA507471 
AI262184 AI139474 A1139476 AI001045 AA614374 AA593153 F33347 F34679 T68225 N25703 AA186999 A1623318 F18313 N72069 
AA903161 H38546 H28672 AI880529 A11 28960 AA299183 AW768886 F17445 F30433 AA303984 AA303687 AA309366 H28320 AI659479 
AA627222 AA064882 AA507447 R53171 AA039476 T79704 R36589 T83222 H26453 AA298798 R53415 N84918 F37846 R94423 AA352679 
AA30861 5 AA375442 BE173864 AA353674 R73519 R62478 T59480 AA089852 AI265789 AI077675 T9077O R54006 H46977 AA187168 
AA157123 H21637 R48072 AA814207 R53082 AA305829 R62359 AI818429 AA887755 AA534238 AI813821 AW023928 AA062712 A1698995 
F19074 AA345870 AI658776 AA903325 S44881 AA379844 N86780 AW089895 F29687 W52257 AA131229 AA978007 AW953024 R94945 
H28332 

25750_1 AU077333 M81635 NM_004099 X60067 AI686183 AW401439 T39535 AA302410 AV645727 AV653397 AA317395 AA218582 AA219682

AA227317 AI750900 BE440055 H77491 F12371 AA314714 T74055 AI655647 AA489421 AA346569 AI1 29523 AA094975 AW793582 R97358 
H67966 N72440H79590 H81459H60508 R39623 H60900 H40547 AA377244 AA31 8430 H71 201 R64651 R65629 H72546AW798947 N76974 
H03029 N77701 AW151751 H60925 AA455839 H72947 N58334 N55487 AI299891 AA581634 AV651323 AV651728 AV650086 AV651295 
AV648042 AW02Q600 AI537887 AA429713 AW08Q244 N73463 AA471335 AW150316 AA360851 W01407 BE074301 W21371 T87221 
AA190691 D16906 AWB62400 AV661466 AI357816 AA442743 AI189966 AW887793 BE005206 A1926016 AA317024 AA976151 AA247314 
AI767184 R64644 R62817 057965 N74437 N74385 H60409 N66059 H91 165 R79462 F09991 R26175 H77853 N32590 D56667 AA461 122 
056666 D56903 AW021856 AA374084 R69734 H66894 T81638 T63958 W23935 R67668 AW021682 H81249 H61959 H89852 R79306 W25710 
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W42984 AA384428 AW994316 H95163 H95158 R33688 W46557 AW748451 AA029916 AA463826 AA314287 R23084 AA368891 H02926 
AA310456 H03632 C02397 RS3745 H94639 R32226 R24648 H44502 AA039671 AA345336 W42846 R48024 R79724 R63143 AA379513 
R21780 R80704 T70422 H21580 H46388 R62779 AA579734 N64111 AA344527 AIB6S473 R66666 Z20058 T52284 H95103 R36513 R21874 
R31363AA220939 BE439695 AI1 89683 AA164901 AI539383 AA768249 AA442361 W028S7 AA30331 5 AW952009 AA31 4544 M076799 
5 AA216780 T70338 AA039572 AW629489 AL044620 AA533203 AA043082 AI668619 AW298204 AW195268 A1391606 AA437282 AW304801 

AW085720 W02586 AA863279 T82339 AI356879 BE464557 AI03B992 AJ190018 BE146083 AI860399 AI039572 AI129687 AW468134 
A1436074 A1983509 A1582239 AW663467 AW129557 AA68029B AA460262 H91217 N57879 R66069 N95584 AA040855 AA2271 16 N94486 
H04229 H97877 AI161080 A1074367 AI025767 AI754185 AA888150 AJ356979 R79463 AA029917 R69637 AI810134 AA460820 AI377990 
AI743170 AA8S4637 AA628548 AA664223 AI362196 AA489363 AI361404 AI363155 AA300504 AI678269 AA633851 H61743 A11 61012 
Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al. (1999) Nature 402:489-495.
Strand: Indicates DNA strand from which exons were predicted
Nt_position: Indicates nucleotide positions of predicted exons.
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it4nnnen 4'wnnorn 

22309950-22309891 


335293 Dunham, I. et al. Minus 22316408-22316275

335682 Dunham, I. et al. Minus 25421215-25421093

335753 Dunham, I. et al. Minus 25761535-25761444




uunnam, i. cl<s. 


Minus 


9 R7Mflfl T7KT7A7 
/OJOUO-ZO / Do/ 4/ 


335756 Dunham, I. et al. Minus 25764330-25764251

336662 Dunham, I. et al. Minus 2158060-2157993

336684 Dunham, I. et al. Minus 2158060-2157993

337603 Dunham, I. et al. Minus 1299296-1299194

338561 Dunham, I. et al. Minus 22311966-22311656

338562 Dunham, I. et al. Minus 22312594-22312465

339186 Dunham, I. et al. Minus 32339211-32339097

325889 5867087 Plus 223829-223891

330032 6682596 Plus 85177-85237

330033 6682596 Plus 86663-86723

326213 5867224 Minus 60751-60927

326816 6552458 Plus 198354-198436

327110 6117842 Plus 94608-94785

327821 5867968 Plus 131060-131232

328164 5868068 Minus 27080-27226

328648 6004473 Plus 424829-424959

329365 5668838 Minus 107687-107765
Table 2A lists about 1165 genes selected to have an interesting expression pattern during androgen withdrawal of prostate cancer tissue. These genes were selected by
analysis of variance, such that the P value is less than 0.01, the 90th percentile exhibits a minimum of 100 average intensity across all samples, and a comparison of any group
means shows a minimum 3 fold change. The interesting expression patterns can be broadly defined into the following categories:

1. Genes that are expressed early in the time course of androgen withdrawal, then drop off in expression, and then express again with emergence of androgen-independence
(hi-lo-lo-hi pattern in table 2A).

2. Genes that are expressed early in the time course, then drop off in expression immediately after androgen withdrawal and do not express again with emergence of
androgen-independence (hi-lo-lo-lo pattern in table 2A).

3. Genes that are expressed early in the time course, then drop off in expression after several days of androgen withdrawal, and do not express again with emergence of
androgen-independence (hi-hi-lo-lo pattern in table 2A).

4. Genes that are not expressed early in the time course, but express only with emergence of androgen-independence (lo-lo-lo-hi pattern in table 2A).

5. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and continue to express with emergence of androgen-independence
(lo-lo-hi-hi pattern in table 2A).

6. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and drop off again with emergence of androgen-independence
(lo-lo-hi-lo pattern in table 2A).
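
As an illustration only, the selection criteria and pattern labels described above can be sketched as follows. The text does not say how the hi/lo calls were made or how the 90th percentile was computed, so the midpoint rule in `pattern_label`, the nearest-rank percentile, and all function names here are assumptions. The ANOVA P value is taken as a precomputed input.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def group_means(groups):
    """Mean intensity of each time-course group, in order."""
    return [sum(g) / len(g) for g in groups]


def passes_selection(groups, p_value, p_cutoff=0.01,
                     min_intensity=100.0, min_fold=3.0):
    """Apply the three stated filters to one probeset.

    groups  : list of lists of intensities, one list per time-course group
    p_value : ANOVA P value, assumed to be computed elsewhere
    """
    if p_value >= p_cutoff:
        return False
    all_values = [v for g in groups for v in g]
    # Assumed reading: the 90th-percentile intensity over all samples
    # must reach the minimum intensity.
    if percentile_nearest_rank(all_values, 90) < min_intensity:
        return False
    means = group_means(groups)
    lowest = min(means)
    if lowest <= 0:
        # Fold change is undefined here; this sketch rejects the probeset.
        return False
    return max(means) / lowest >= min_fold


def pattern_label(groups):
    """Hypothetical hi/lo call: 'hi' if the group mean is at or above the
    midpoint between the lowest and highest group means."""
    means = group_means(groups)
    midpoint = (min(means) + max(means)) / 2
    return "-".join("hi" if m >= midpoint else "lo" for m in means)


example = [[20, 30], [25, 35], [200, 220], [30, 40]]
print(passes_selection(example, p_value=0.001), pattern_label(example))
```

With four groups ordered along the time course, the label produced for `example` corresponds to category 6 above.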

Table 2B lists accession numbers for primekeys lacking a unigeneID in table 2A. For each probeset is listed a gene cluster number from which oligonucleotides were designed.
Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland California). Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.

Table 2C lists genomic positioning for primekeys lacking unigene ID's and accession numbers in table 2A. For each predicted exon is listed genomic sequence source used for
prediction. Nucleotide locations of each predicted exon are also listed.
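
A minimal sketch of how one strand and nucleotide-position entry of the kind listed in these genomic positioning tables might be read. The function name is hypothetical, and treating the two coordinates as inclusive is an assumption; the tables themselves only give the two numbers and the strand.

```python
def parse_exon(strand, nt_position):
    """Parse an entry such as ("Minus", "25764330-25764251").

    Minus-strand entries list the higher coordinate first, so the two
    numbers are normalised to start <= end. The length assumes inclusive
    coordinates, which the source does not state.
    """
    first, second = (int(part) for part in nt_position.split("-"))
    start, end = min(first, second), max(first, second)
    return {
        "strand": strand.lower(),
        "start": start,
        "end": end,
        "length": end - start + 1,
    }


print(parse_exon("Minus", "25764330-25764251"))
print(parse_exon("Plus", "223829-223891"))
```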

TABLE 2A: ABOUT 1165 GENES SELECTED TO HAVE AN INTERESTING EXPRESSION PATTERN DURING ANDROGEN WITHDRAWAL OF PROSTATE CANCER
TISSUE

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Pattern: Broadly defined expression patterns during androgen withdrawal



Pkey ExAccn UnigeneID Unigene Title Pattern


433412 


AV653759 


Hs.8185 


\AM-*tH prUlclJl, oUlIIUc UcliyUI Oy GllcJoc [| 


KHO-nWO 


429097 


AKDQ1270 


no. I3DUOO 




KHO-nHO 


442731 


niouo id/ 


Ms 111 (\AA 


CO IS 


l_ >- t_ 

KMO-nHO 


420820 


W760QR 


1 lo.sJOUOO J 


Hnmn oanipnc rJnnp ltJl&CE'Ai7QAR0 mPKIA 


KHO-nHO 






Lie 1 1/1/119 


WIAA101Q nmlam 

iwv\i zi o proiem 


lo-to-hMo 


*r ICSJDO 


no 130/ 


U c OfiQfMfi 

ns.zDsiW) 


cots 


lo-to-hWo 






ns. fozoz 


cathepsin O 


(o-to-hMo 


410209 AI583661 Hs.60548 hypothetical protein PRO1635 lo-lo-hi-lo

428523 AW974540 Hs.98626 ESTs lo-lo-hi-lo

435847 W93821 Hs.39780 CDA017 protein lo-lo-hi-lo

443967 AW294013 Hs.200942 ESTs lo-lo-hi-lo

440838 AA907075 Hs.131307 ESTs lo-lo-hi-lo

404054 Target Exon lo-lo-hi-lo

431697 H66740 Hs.38540 ESTs, Weakly similar to ALU4_HUMAN ALU S lo-lo-hi-lo

432114 AL036021 Hs.8934 ESTs lo-lo-hi-lo

446397 AW275603 Hs.200712 ESTs lo-lo-hi-lo

414094 H15088 Hs.31433 ESTs lo-lo-hi-lo

424005 AB033041 Hs.137507 vang (van gogh, Drosophila)-like 2 lo-lo-hi-lo

424401 H67220 Hs.169681 death effector domain-containing lo-lo-hi-lo

449749 AI668611 Hs.49760 ESTs lo-lo-hi-lo

458368 BE504731 Hs.138827 ESTs lo-lo-hi-lo

427221 L15409 Hs.174007 von Hippel-Lindau syndrome lo-lo-hi-lo

432715 AA247152 Hs.200483 ESTs, Weakly similar to KIAA1074 protein lo-lo-hi-lo

425980 AA366951 gb:EST77963 Pancreas tumor III Homo sapi lo-lo-hi-lo

412492 AW962604 gb:EST374677 MAGE resequences, MAGG Homo lo-lo-hi-lo

438882 AA827695 gb:od56c02^1 NCI_CGAP_GCB1 Homo sapiens lo-lo-hi-lo

422473 U94780 Hs.117242 meningioma expressed antigen 6 (coiled-c lo-lo-hi-lo

404211 NM_005936:Homo sapiens myeloid/lymphoid lo-lo-hi-lo

423019 AI640185 Hs.283626 ESTs lo-lo-hi-lo

443559 AI076765 Hs.269899 ESTs, Moderately similar to ALU8_HUMAN A lo-lo-hi-lo

444291 AI598022 Hs.193989 TAR DNA binding protein lo-lo-hi-lo

428065 AI634046 Hs.157313 ESTs lo-lo-hi-lo

442566 R37337 Hs.12111 ESTs lo-lo-hi-lo

442202 BE272862 Hs.106534 hypothetical protein FLJ22625 lo-lo-hi-lo

439456 AI752409 Hs.109314 hypothetical protein FLJ20980 lo-lo-hi-lo

423476 AL035633 Human DNA sequence from clone RP5-1046G1 lo-lo-hi-lo

437952 D63209 Hs.5944 solute carrier family 11 (proton-coupled lo-lo-hi-lo

451987 AA815092 Hs.77554 Homo sapiens cDNA FLJ14967 fis, clone TH lo-lo-hi-lo

453408 AI804732 Hs.295963 ESTs lo-lo-hi-lo

444004 N39842 Hs.301444 KIAA1673 lo-lo-hi-lo

452691 AA164842 Hs.192619 KIAA1600 protein lo-lo-hi-lo

434865 AW050449 Hs.116507 ESTs lo-lo-hi-lo

440819 AI809444 Hs.202108 ESTs lo-lo-hi-lo

419526 AI821895 Hs.193481 ESTs lo-lo-hi-lo

422072 AB018255 Hs.111138 KIAA0712 gene product lo-lo-hi-lo

453459 BE047032 Hs.257789 ESTs lo-lo-hi-lo

419038 AW134924 Hs.190325 ESTs lo-lo-hi-lo

413243 AA769266 Hs.193657 ESTs lo-lo-hi-lo

432079 AW972746 gb:EST384840 MAGE resequences, MAGL Homo lo-lo-hi-lo
441328 AI982794 Hs.159473 ESTs lo-lo-hi-lo

416508 R39769 ESTs, Moderately similar to ALU8_HUMAN A lo-lo-hi-lo

451066 AI758660 Hs.206132 ESTs lo-lo-hi-lo

446017 N98238 Hs.55185 ESTs lo-lo-hi-lo

447104 R19085 Hs^10706 Homo sapiens cDNA FLJ13182 fis, clone NT lo-lo-hi-lo

447211 AL161961 Hs.17767 KIAA1554 protein lo-lo-hi-lo

447765 AW014112 Hs.161390 ESTs lo-lo-hi-lo

429540 M85776 gb:EST02297 Fetal brain, Stratagene (cat lo-lo-hi-lo

444314 AI140497 gb:ow76b09.s1 Soares_fetal_liver_spleen_ lo-lo-hi-lo

414555 N98569 Hs.76422 phospholipase A2, group IIA (platelets, lo-lo-hi-lo

432677 NM_004482 Hs.278611 UDP-N-acetyl-alpha-D-galactosamine:polyp lo-lo-hi-lo

422091 AI906339 Hs.97927 ESTs lo-lo-hi-lo

423028 H90946 gb:yu86c02.r1 Soares fetal liver spleen lo-lo-hi-lo

444040 AF204231 Hs.182982 golgin-67 lo-lo-hi-lo

441111 AI806867 Hs.126594 ESTs lo-lo-hi-lo

418838 AW385224 Hs.35198 ectonucleotide pyrophosphatase/phosphodi lo-lo-hi-lo

415999 AA172179 Hs.294029 ESTs lo-lo-hi-lo

429615 AF258627 Hs.211562 ATP-binding cassette, sub-family A (ABC1 lo-lo-hi-lo

427774 AA278583 Hs.180737 Homo sapiens clone 23664 and 23905 mRNA lo-lo-hi-lo

438585 AA811371 Hs.123362 ESTs lo-lo-hi-lo

424776 AI867931 Hs.164595 ESTs lo-lo-hi-lo

413786 AW613780 Hs.13500 ESTs lo-lo-hi-lo

421077 AK000061 Hs.101590 hypothetical protein lo-lo-hi-lo

445837 AI261700 Hs.145544 ESTs lo-lo-hi-lo

449282 AL048056 Hs.23437 Homo sapiens cDNA FLJ13555 fis, clone PL lo-lo-hi-lo

414065 AW515373 Hs.271249 Homo sapiens cDNA FLJ13580 fis, clone PL lo-lo-hi-lo

432527 AW975028 Hs.102754 ESTs lo-lo-hi-lo

412093 BE242691 Hs.14947 ESTs lo-lo-hi-lo

457121 AI743770 Hs.180513 ESTs, Weakly similar to KIAA0822 protein lo-lo-hi-lo

417280 AW173116 Hs.250103 ESTs lo-lo-hi-lo

452445 AB002438 Hs.29596 Homo sapiens mRNA from chromosome 5q21-2 lo-lo-hi-lo

438624 AA889055 Hs.123468 ESTs lo-lo-hi-lo

442343 AA992480 Hs.129874 ESTs lo-lo-hi-lo

401416 C14000338*:gi|7459502|pir||S74665 outer lo-lo-hi-lo

437176 AW176909 Hs.42346 calcineurin-binding protein calsarcin-1 lo-lo-hi-lo

451663 AI872360 Hs.209293 ESTs lo-lo-hi-lo

449295 AW137268 Hs.270954 ESTs lo-lo-hi-lo

426848 H72531 Hs.36190 ESTs lo-lo-hi-lo

445467 AI239832 Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S lo-lo-hi-lo

418662 AI801098 Hs.151500 ESTs lo-lo-hi-lo

416239 AL038450 Hs.48948 ESTs lo-lo-hi-lo

428054 AI948688 Hs.266619 ESTs lo-lo-hi-lo

435284 AA879470 Hs.96849 Homo sapiens cDNA FLJ11492 fis, clone HE lo-lo-hi-lo

424332 AA338919 Hs.101615 ESTs lo-lo-hi-lo

442369 AJ565071 Hs.159983 ESTs lo-lo-hi-lo

420717 AA284447 Hs.271887 ESTs lo-lo-hi-lo

439584 AA838114 Hs.221612 ESTs lo-lo-hi-lo

440260 AI972867 Hs.7130 copine IV lo-lo-hi-lo

426269 H15302 Hs.168950 Homo sapiens mRNA; cDNA DKFZp566A1046 (f lo-lo-hi-lo

428398 AI249368 Hs.98558 ESTs lo-lo-hi-lo

407276 AI951118 Hs.326736 Homo sapiens breast cancer antigen NY-BR lo-lo-hi-lo

409339 AB020686 Hs.54037 ectonucleotide pyrophosphatase/phosphodi lo-lo-hi-lo

442150 AI368158 Hs.70983 PTPL1-associated RhoGAP 1 lo-lo-hi-lo

415787 H01463 Hs.93534 ESTs lo-lo-hi-lo

430685 AI690234 Hs.191666 ESTs, Weakly similar to GNMSLL retroviru lo-lo-hi-lo

443794 N94104 Hs.29280 ESTs lo-lo-hi-lo

446215 AW821329 Hs.14368 SH3 domain binding glutamic acid-rich pr lo-lo-hi-lo

441285 NM_002374 Hs.167 microtubule-associated protein 2 lo-lo-hi-lo

448738 BE614081 gb:601503815F1 NIH_MGC_71 Homo sapiens c lo-lo-hi-lo

403746 ENSP00000226812*:KIAA1494 protein (Fragm lo-lo-hi-lo

434022 R18374 Hs.117956 ESTs lo-lo-hi-lo

435714 AA699325 Hs.269880 ESTs lo-lo-hi-lo

439848 AW979249 gb:EST391359 MAGE resequences, MAGP Homo lo-lo-hi-lo

421974 AA301270 gb:EST14192 Testis tumor Homo sapiens cD lo-lo-hi-lo

433332 AI367347 Hs.44898 Homo sapiens clone TGCCTA00151 mRNA sequ lo-lo-hi-lo

449919 AI674685 Hs.200141 ESTs lo-lo-hi-lo

407192 AA609200 gb:af12e01.s1 Soares_testis_NHT Homo sap lo-lo-hi-lo

436169 AA888311 Hs.17602 Homo sapiens cDNA FLJ12381 fis, clone MA lo-lo-hi-lo

418624 AI734080 Hs.104211 ESTs lo-lo-hi-lo

432432 AA541323 Hs.115831 ESTs lo-lo-hi-lo

426172 AA371307 Hs.125056 ESTs lo-lo-hi-lo

401093 C12000586*:gi|6330167|dbj|BAA86477.1|(A lo-lo-hi-lo

426716 NM_006379 Hs.171921 sema domain, immunoglobulin domain (Ig), lo-lo-hi-lo

439569 AW602166 Hs.222399 CEGP1 protein lo-lo-hi-lo

451720 AW970985 Hs.290853 ESTs lo-lo-hi-lo

429163 AA884766 gb:am20a10.s1 Soares_NFL_T_GBC_S1 Homo s lo-lo-hi-lo

432435 BE218886 Hs.282070 ESTs lo-lo-hi-lo

408170 AW204516 Hs.31835 ESTs lo-lo-hi-lo

433530 BE349534 Hs.281789 ESTs lo-lo-hi-lo

425776 U25128 Hs.159499 parathyroid hormone receptor 2 lo-lo-hi-lo

430068 AA464964 gb:zx80f10.s1 Soares ovary tumor NbHOT H lo-lo-hi-lo

422725 AA315703 Hs.199993 ESTs, Weakly similar to ALUB_HUMAN !!!! lo-lo-hi-lo
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432314 AA533447 Hs.312989 ESTs to-kMiHo 

434609 R76593 gb^i60c11 ji Soares placenta Nb2HP Homo lo-to-hi-to 

448760 AA313825 H*l21941 AD036 protein Jo-to-hi-to 

417381 AF164142 Hs.82042 solute earner family 23 (nudeobase Ira to-to-hi-io 

456334 T50392 Hs.271745 ESTs * lo-to-hi-to 

435445 AA737345 Hs.294041 ESTs lo-to-hMo 

411928 AA888624 Hs. 197289 rab3 GTPase-activaiing protein, non-cata lo-lo-hWo 

438869 AR3750Q9 gbiHomo sapiens fuf! length insert cDNA lo-to-hi-to 

423932 T95633 Hs.189703 ESTs to-to-hi-lo 

422222 A1699372 Hs.193247 hypomeficaf protem DKFZp434A171 to-to-hHo 

434941 AW073202 Hs.334825 Homo sapiens cDNA RJ 14752 fis. clone NT lo-to-hMo 

415736 AA827082 Hjl291872 ESTs lo-to-hi-to 

432722 AA830532 Hs. 326 150 ESTs lo-lo-bUo 

435511 AA683336 Hs.189046 ESTs lo-!o-hi-to 

432242 AW022715 Hs.162160 ESTs, Weakly similar to ALU4_HUMAN ALU S lo-lo-hi-lo 

451141 AW772713 Hs^47186 ESTs lo-to-hi-to 

450546 AA010200 Hs.175551 ESTs lo-to-hi-to 

413351 BE086815 ESTs lo-lo-hi-to 

439324 AF086134 Hs.94309 ESTs lo-to-hi-to 

452688 AA721140 Hs.49930 ESTs, Weakly similar to putative pi 50 [H lo-lo-hHo 

415669 NMJ005025 Hs.785B9 serine (or cysteine) proteinase inhibilo to-to-hi-to 

450164 AI239923 Hs.63931 ESTs lo-to-hi-to 

417169 R13550 Hs.246773 ESTs lo-to-hi-to 

443645 R36475 Hs.24321 Homo sapiens cDNA FU12028 fis, done HE lo-to-hi-to 

424878 H57111 Hs^21132 ESTs lo-lo-hMo 

449618 AI076459 Hs.15978 KIAA1 272 protein lo-lo-hHo 

432572 AI660840 Hs.191202 ESTs, WeaWy similar to ALUE HUMAN!)!! lo-to-hMo 

400293 N51002 Hs.306480 Homo sapiens mRNA;cONADKFZp761E21 12 (f lo-lo-hi-lo 

431474 AL133990 Hs.190642 CEGP1 protein lo-b-hMo 

421674 T10707 Hs,296355 hypothetical protein FU23 138 lo-lo-hMo 

438494 AA908678 Hs.130183 ESTs lo-lo-hMo 

425332 AA633306 Hs.127279 ESTs Io-!onhWo 

451411 AA017492 Hs.135655 EST hMtMTMo 

419972 AL041465 Hs.182982 golgin-67 lo-lo-hMo 

434804 AA649530 Hs.348148 gb:ns44f05.s1 NCLCGAP_Alv1 Homo sapiens lo-to-hi-to 

442832 AW206560 Hs.253569 ESTs lo-to-hi-to 

408660 AA525775 ESTs, Moderately similar to PC4259 fern lo-to-hi-to 

432674 AA641092 Hs.257339 ESTs, WeaWy similar to I38022 hypothec" lo-lo-hi-lo 

448150 M72167 ESTs to-to-hMo 

450468 AW379075 Hs.141742 Homo sapiens cDNA FU1221 1 fis, done MA lo-to-hi-to 

452874 AK001061 Hs.30925 hypothetical protein RJ 101 99 lo-lo-hMo 

412088 A1689496 Hs.108932 ESTs lo-lo-hMo 

443451 AI057404 Hs.58698 ESTs lo-to-hMo 

453853 AL040600 Hs. 188083 ESTs lo-to-hMo 

419863 AW952691 Hs.93485 Homo sapiens mRNA; cDNADKFZp76 101 91 (fr to-io-tii-to 

420729 AW964897 Hs.290825 ESTs lo-to-hi-to 

440801 AA906366 Hs. 190535 ESTs lo-lo-hMo 

407284 AJ539227 Hs.214039 hypothetical protein FLJ23556 lo-lo-hi-lo

428279 AA425310 Hs.155766 ESTs, Weakly similar to A47582 B-cell gr lo-lo-hi-lo

436862 AI821940 ESTs, Moderately similar to ALU8_HUMAN A lo-lo-hi-lo

432340 AA534222 gb:nj21d02.s1 NCI_CGAP_AA1 Homo sapiens lo-lo-hi-lo

442048 AA974603 gb:op34f05.s1 Soares_NFL_T_GBC_S1 Homo s lo-lo-hi-lo

418781 T41160 Hs.8404 ESTs lo-lo-hi-lo

450642 R39773 Hs.7130 copine IV lo-lo-hi-lo

451661 AB020650 Hs.26777 Homo sapiens, Similar to KIAA0843 protei lo-lo-hi-lo

435812 AA700439 Hs.188490 ESTs lo-lo-hi-lo

448065 AI459177 Hs.172759 ESTs, Moderately similar to ALU7_HUMAN A lo-lo-hi-lo

453486 AL039201 Hs.173554 ubiquinol-cytochrome c reductase core pr lo-lo-hi-lo

414312 AA155694 Hs.191060 ESTs lo-lo-hi-lo

438980 AW502384 gb:UI-HF-BROp-aka-M2-0-Ut.r1 NIH_MGC_5 lo-lo-hi-lo

408001 AA046458 Hs.95296 ESTs lo-lo-hi-lo

421476 AW953805 Hs.21887 ESTs lo-lo-hi-lo

414426 D60745 Hs.25925 Homo sapiens, clone MGC:15393, mRNA, com lo-lo-hi-lo

444563 N57Q57 Hs.284163 ANKHZN protein lo-lo-hi-lo

418771 AA807881 Hs.25329 ESTs lo-lo-hi-lo

417843 W07361 Hs.22545 Homo sapiens cDNA FLJ12935 fis, clone NT lo-lo-hi-lo

415565 AA642449 Hs.48994 ESTs, Weakly similar to AF151600 1 CGJ-4 lo-lo-hi-lo

419229 AI827237 Hs.282884 ESTs lo-lo-hi-lo

419905 AW248229 Hs.93659 protein disulfide isomerase related prot lo-lo-hi-lo

452870 AW502761 Hs.30909 KIAA0430 gene product lo-lo-hi-lo

449059 AK000566 Hs.98135 hypothetical protein FLJ20559 lo-lo-hi-lo

416157 NM_003243 Hs.342874 transforming growth factor, beta recepto lo-lo-hi-lo

439305 AW393883 Hs.98968 hypothetical protein FLJ23058 lo-lo-hi-lo

419235 AW470411 Hs^88433 neurotrimin lo-lo-hi-lo

416640 BE262478 Hs.79404 neuron-specific protein lo-lo-hi-lo

434938 AW500718 Hs.8115 Homo sapiens, clone MGC:16169, mRNA, com lo-lo-hi-lo

408177 AI241733 Hs.43871 ESTs lo-lo-hi-lo

438459 T49300 Hs.35304 Homo sapiens cDNA FLJ13655 fis, clone PL lo-lo-hi-lo

418381 AA682393 Hs.119237 ESTs lo-lo-hi-lo

432161 AK000400 Hs.341181 ESTs, Weakly similar to envelope [H.sapi lo-lo-hi-lo

418283 S79895 Hs.83942 cathepsin K (pycnodysostosis) lo-lo-hi-lo

421443 BE550141 Hs.156148 hypothetical protein FLJ13231 lo-lo-hi-lo
416619 AF013168 Hs.79393 tuberous sclerosis 1 lo-lo-hi-lo

449802 AW901804 Hs.23984 hypothetical protein FLJ20147 lo-lo-hi-lo

446714 W73818 Hs.110028 ESTs lo-lo-hi-lo

413195 AA127382 Hs.22404 protease, serine, 12 (neurotrypsin, moto lo-lo-hi-lo


438233 


W52448 


Hs.56t47 


COT*, 
to IS 


Inlnfii In 


416051 AA835868 Hs.25253 mannosidase, alpha, class 1A, member 1 lo-lo-hi-lo

438855 AW946276 Hs.6441 Homo sapiens mRNA; cDNA DKFZp586J021 (fr lo-lo-hi-lo

425907 AA365752 Hs.155965 ESTs lo-lo-hi-lo

451295 AI557212 Hs.17132 ESTs, Moderately similar to I54374 gene lo-lo-hi-lo

415443 T07353 Hs.7948 ESTs lo-lo-hi-lo

422366 T83882 Hs.97927 ESTs lo-lo-hi-lo

435163 M668884 Hs.19155 ESTs lo-lo-hi-lo

426559 AB001914 Hs.170414 paired basic amino acid cleaving system lo-lo-hi-lo

448988 Y09763 Hs.22785 gamma-aminobutyric acid (GABA) A recepto lo-lo-hi-lo


453655 AW960427 Hs.342874 transforming growth factor, beta recepto lo-lo-hi-lo

414516 AI307802 Hs.135560 ESTs, Weakly similar to T43458 hypotheti lo-lo-hi-lo

420028 AB014680 Hs.8786 carbohydrate (N-acetylglucosamine-6-O) s lo-lo-hi-lo

430223 NM_002514 Hs.235935 nephroblastoma overexpressed gene lo-lo-hi-lo

425887 AL049443 Hs.161283 Homo sapiens mRNA; cDNA DKFZp586N2020 (f lo-lo-hi-lo

442577 AA292998 Hs.163900 ESTs lo-lo-hi-lo

424940 AA985308 Hs.283902 ESTs lo-lo-hi-lo

428839 AI767756 Hs.82302 Homo sapiens cDNA FLJ14814 fis, clone NT lo-lo-hi-lo


443868 W88483 Hs.293650 Homo sapiens mRNA for RGPR-p117, complet

430334 AI824719 Hs.328700 ESTs lo-lo-hi-lo

439686 W40445 Hs.235857 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-lo

423754 NM_016181 Hs.132526 melanoma antigen lo-lo-hi-lo

415205 H71616 Hs.135233 ESTs lo-lo-hi-lo

426413 AA377823 gb:EST90805 Synovial sarcoma Homo sapien lo-lo-hi-lo

407204 R41933 Hs.140237 ESTs, Weakly similar to ALULHUMAN ALU S lo-lo-hi-lo

430234 N29317 Hs.236463 KIAA1238 protein lo-lo-hi-lo


437143 AW204056 Hs.8917 ESTs lo-lo-hi-hi

445162 AB011131 Hs.12376 piccolo (presynaptic cytomatrix protein) lo-lo-hi-hi

415083 AI632683 Hs.27179 Homo sapiens cDNA FLJ12933 fis, clone NT lo-lo-hi-hi

442924 AA533513 Hs.93659 protein disulfide isomerase related prot lo-lo-hi-hi

429536 AA873016 Hs.206097 oncogene TC21 lo-lo-hi-hi

458584 AF217518 Hs.324136 PTD012 protein lo-lo-hi-hi

419647 AA348947 Hs.91816 hypothetical protein lo-lo-hi-hi

427201 AB037860 Hs.173933 nuclear factor I/A lo-lo-hi-hi

428030 AI915228 Hs.11493 Homo sapiens cDNA FLJ13536 fis, clone PL lo-lo-hi-hi

411779 AA292811 Hs.72050 non-metastatic cells 5, protein expresse lo-lo-hi-hi


442482 NM_014039 Hs.8360 PTD012 protein lo-lo-hi-hi

417458 NM_005655 Hs.82173 TGFB inducible early growth response lo-lo-hi-hi

438021 AV653790 Hs.324275 WW domain-containing protein 1 lo-lo-hi-hi

409799 01 1928 Hs.76845 phosphoserine phosphatase-like lo-lo-hi-hi

440676 NM_004987 Hs.112378 LIM and senescent cell antigen-like doma lo-lo-hi-hi

421437 AW821252 Hs.104336 hypothetical protein lo-lo-hi-hi

456362 AW973003 Hs.179909 hypothetical protein FLJ22995 lo-lo-hi-hi

407686 AW901268 Hs.126043 chromosome 21 open reading frame 51 lo-lo-hi-hi

431129 AL137751 Hs.263671 Homo sapiens mRNA; cDNA DKFZp434l0812 (f lo-lo-hi-hi

431874 AW610031 Hs.323914 translocase of inner mitochondrial membr lo-lo-hi-hi


448072 AI459306 Hs.24908 ESTs lo-lo-hi-hi

436860 H12751 Hs.5327 PRO1914 protein lo-lo-hi-hi

448770 AA326683 Hs.21992 likely ortholog of mouse variant polyade lo-lo-hi-hi

428044 AA093322 Hs.301404 RNA binding motif protein 3 lo-lo-hi-hi

451468 AW503398 Hs.293663 ESTs, Moderately similar to I38022 hypot lo-lo-hi-hi

440278 BE560870 Hs.9052 ESTs, Weakly similar to 2004399A chromos lo-lo-hi-hi

441102 AA973905 intermediate filament protein syncoilin lo-lo-hi-hi

423942 AF209704 Hs.135723 glycolipid transfer protein lo-lo-hi-hi

425254 U91985 Hs.105658 DNA fragmentation factor, 45 kD, alpha p lo-lo-hi-hi

409324 W76202 Hs.343812 lipoic acid synthetase lo-lo-hi-hi


431707 R21326 Hs.267905 hypothetical protein FLJ10422 lo-lo-hi-hi

423335 AB018337 Hs.127287 KIAA0794 protein lo-lo-hi-hi

429200 AA447871 Hs.194215 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-hi

429898 AW117322 Hs.42366 ESTs lo-lo-hi-hi

409604 AW444448 Hs.49124 ESTs lo-lo-hi-hi

431797 BE169641 Hs.270134 hypothetical protein FLJ20280 lo-lo-hi-hi

437576 BE514383 prothymosin, alpha (gene sequence 28) lo-lo-hi-hi

415992 C05837 Hs.145807 hypothetical protein FLJ13593 lo-lo-hi-hi

458537 W24704 Hs.54773 ESTs lo-lo-hi-hi


417665 AW852850 Hs.22662 ESTs

422292 AI815733 Hs.114360 transforming growth factor beta-stimulat lo-lo-hi-hi

421501 M29971 Hs.1384 O-6-methylguanine-DNA methyltransferase lo-lo-hi-hi


457952 






numan cnroniosurns l #4* i inruw uone iiwd. i 




414630 BE410857 Hs.16064 gb:601301177F1 NIH_MGC_21 Homo sapiens c lo-lo-hi-hi

421990 T31811 Hs.110480 DC12 protein lo-lo-hi-hi

404956 C100321(T:gi|6912582|ref|NP_036524.1| pe lo-lo-hi-hi

436829 AW297958 Hs.163109 ESTs lo-lo-hi-hi

402106 AK002178 hypothetical protein FLJ11316 lo-lo-hi-hi

404384 NM_020632*:Homo sapiens ATPase, H+ tra lo-lo-hi-hi

445123 AI762911 Hs.145369 ESTs lo-lo-hi-hi

401757 Target Exon lo-lo-hi-hi

439502 AA836672 Hs.130694 ESTs lo-lo-hi-hi
AI015709



AA379658 Hs.272759 

AL036166 Hs.323378 

AF227156 Hs.110103 

BE081857 Hs.94211 

AW246428 NsJ5355 

JC3626 Hs.2057 



AL046508 Hs.270607 

W814B6 Hs.58648 

AI338952 Hs.32194 

AW892432 Hs.65307 

T03890 Hs.157208 
NM_014115

AA100529 Hs.286232 
AA355749 

AW161061 Hs.62954 



AW850818 

AK000626 

AW898791 

AJ132560 

BE389229 

H19480 

AI792302 

R84694 

T80795 



Hs.16230 
Hs.118837 

Hs.30954 

Hs.268787 

Hs.248141 

Hs.79194 

Hs.193702 



400111 
405446 
401563 
402786 
426484 
414343 
421970 
422592 
413431 
426746 
400237 
402532 
402396 

459649 AW298364 Hs.289292 
401512 
448622 
400501 
452324 
453146 
430445 
401750 
435236 
400375 
412151 
410498 
405044 
413169 
402101 
455019 
446826 
412180 
407273 
452895 
416117 
430934 
416309 
444578 
401966 
444850 
403885 
405435 
422694 
422912 
412748 
403704 
440507 
405503 
456123 
454261 
458956 
418367 
444553 
405811 
429461 
423378 
458516 
404039 
454148 
412678 
449298 
405525 
424576 
451601 
422395 
434333 
413509 
419504 
448586 
401209 
423554 
439803 
424593 
408122 
409958 
408214 
421911 
407813 
425211 
442772 
419733 
428260 
427083 



AW444882 Hs.148483 



C06003 Hs.23782 

AW405973 Hs.11637 

BE083158 Hs.10862 

H06994 

R00602 

AF216077 Hs.48376 
BE220675 

AA326035 Hs.59236 

AI167530 Hs.149380 

A11B8219 Hs.99311 

BE313601 Hs.164866 

BE010749 Hs.255097 

AW732837 Hs.42390 

AA115575 Hs.114914 

AI911333 Hs.171689 

BE154142 Hs.95833 

N92100 Hs.97437 

AA310177 Hs.103931 

AA186733 Hs.292154 



BE145419 
AI088585



Hs.118904 



AF285120 Hs.283734 



M90516 

AA001021 

AA343729 

AI432652 

NM_001523

AL120445 

AL041520 

AL120247 

M18667 

AW503680 

AW362955 

AW290886 

NM_006363



Hs.1674 
Hs.6685 

Hs.42824 
Hs.57697 
Hs.77823 

Hs.40109 

Hs.1867 

Hs.5957 

Hs.224961 

Hs.86999 

Hs.173497 



Eos Control

Homo sapiens mRNA; cDNA DKFZp586l2022 (f

C15001262*:gi|7304981|ref|NP_038528.1| ca

C1000887*:gi|12732453|ref|XP_011474.1| C

KIAA1457 protein

coated vesicle membrane protein

RNA polymerase I transcription factor RR

rcd1 (required for cell differentiation,

ubiquitin-conjugating enzyme E2N (homolo

uridine monophosphate synthetase (orotat

NM_001087*:Homo sapiens angio-associated

Target Exon

Target Exon

ESTs

NM_014080:Homo sapiens dual oxidase-like

ESTs, Weakly similar to STK2_HUMAN SERIN

ENSP00000251912*:KIAA1617 protein (Fragm

ESTs

ESTs

ESTs

NM_012448*:Homo sapiens signal transduce
ESTs, Highly similar to ARX_MOUSE HOMEOB
NM_014115*:Homo sapiens PRO0113 protein
Homo sapiens cDNA: FLJ23190 fis, clone L
gb:EST64459 Jurkat T-cells VI Homo sapie
NM_014630*:Homo sapiens KIAA0211 gene pr
ESTs, Weakly similar to zinc finger prot
ENSP00000217725*:Laminin alpha-1 chain p
gb:IL3-CT0220-091199-026-A03 CT0220 Homo
hypothetical protein FLJ20619
gb:CM0-NN0075-130400-332-f06 NN0075 Homo
gb:Homo sapiens mRNA for immunoglobulin
phosphomevalonate kinase
ESTs 

potassium inwardly-rectifying channel, s 
cAMP responsive element binding protein 
ESTs 

C17000574:gi|8923190|ref|NP_060178.1| hy
ESTs

Target Exon
Target Exon

hypothetical protein FLJ12847
ESTs

Homo sapiens cDNA: FLJ23313 fis, clone H
Target Exon

gb:yi81b07.r1 Soares infant brain 1NIB H
C7000609*:gi|628012|pir||A53933 myosin I
gb:ye74c04.r1 Soares fetal liver spleen
Homo sapiens clone HB-2 mRNA sequence
gb:ht98f11.x1 NCI_CGAP_Lu24 Homo sapiens
hypothetical protein DKFZp434L0718
ESTs

NM_024810:Homo sapiens hypothetical prot
ESTs, Weakly similar to HSJ2_HUMAN DNAJ
hypothetical protein FLJ22558
ESTs

ENSP00000247650*:Hypothetical 177.6 kDa
nasopharyngeal carcinoma susceptibility
ESTs
ESTs

NM_002439*:Homo sapiens mutS (E. coli) h
ESTs

centrosomal protein 1
DKFZP434B0335 protein
stromal cell protein

gb:IL5-HT0198-291099-009-E01 HT0198 Homo
ESTs

CGI-204 protein

C12000519:gi|7710046|ref|NP_057914.1| Id
glutamine-fructose-6-phosphate transamin
thyroid hormone receptor interactor 8
gb:EST49730 Gall bladder 1 Homo sapiens
hypothetical protein FLJ10718
hyaluronan synthase 1
hypothetical protein FLJ21343
gb:DKFZp434G2317_s1 434 (synonym: htes3)
KIAA0872 protein
progastricsin (pepsinogen C)
Homo sapiens clone 24416 mRNA sequence
Homo sapiens cDNA FLJ14415 fis, clone HE
ESTs, Weakly similar to S65657 alpha-1C-
Sec23 (S. cerevisiae) homolog B



lo-lo-hi-hi 
to-to-hi-hi 
toJo-hPu 
lo-lo-hUti 
lo-lo-hi-hi 
to-to-hi-hi 
to-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
fo-io-hi-hi 
lo-to-hi-hi 
lo-to-hi-hi 
lo-to-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
loAolMi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
lo-to-hi-hi 
lo-to-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
fo-to-hi-til 
lo-lo-hi-hi 
lo-to-hi-hi 
lo-to-hi-hi 
to-to-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
to-to-hi-hi 
to-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
to-to-hi-hi 
lo-lo-hi-hi 
to-to-hi-hi 
to-to-hi-hi 
lo-to-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
to-to-hi-hf 
lo-lo-hi-hi 
to-to-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
lo-to-hi-hi 
to-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
to-to-hi-hi 
lo-to-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
fo-to-hMu 
to-to-hi-hi 
lo-lo-hi-hi 
lo-to-hi-hi 
lo-to-hMii 
lo-lo-hi-hi 
lo-lo-hi-hi 
to-io-hMii 
to-lo-hi-hi 
lo-lo-hi-hi 
418583 AA604379 Hs.86211 hypothetical protein lo-lo-hi-hi

407355 AA846203 Hs.193974 ESTs, Weakly similar to ALU1_HUMAN ALU S lo-lo-hi-hi

454003 AA058944 Hs.116602 Homo sapiens, clone IMAGE:4154008, mRNA, lo-lo-hi-hi

425322 U63630 Hs.155637 protein kinase, DNA-activated, catalytic lo-lo-hi-hi

402240 Target Exon lo-lo-hi-hi

421867 AA481078 Hs.109045 hypothetical protein FLJ10498 lo-lo-hi-hi

408603 R25283 Hs.325416 Homo sapiens mRNA; cDNA DKFZp564H1916 (f lo-lo-hi-hi

437389 AL359587 Hs.271586 hypothetical protein DKFZp762M115 lo-lo-hi-hi

457148 AF091035 Hs.184627 KIAA0118 protein lo-lo-hi-hi

400277 Eos Control lo-lo-hi-hi

400995 C11000295*:gi|12737279|ref|XP_012163.1| lo-lo-hi-hi

400818 Target Exon lo-lo-hi-hi

402758 C1001899*:gi|12722636|ref|XP_010672.1| e lo-lo-hi-hi

403708 Target Exon lo-lo-hi-hi

405610 ENSP00000241065*:CDNA lo-lo-hi-hi

414242 AA749230 Hs.26433 dolichyl-phosphate (UDP-N-acetylglucosam lo-lo-hi-hi

420757 X78592 Hs.99915 androgen receptor (dihydrotestosterone r lo-lo-hi-hi

400965 C11002190*:gi|12737279|ref|XP_012163.1| lo-lo-hi-hi

401192 Target Exon lo-lo-hi-hi

404407 Target Exon lo-lo-hi-hi

401405 Target Exon lo-lo-hi-hi

403055 C2002219*:gi|12737280|ref|XP_006682.2| k lo-lo-hi-hi

404661 C9000305*:gi|12737280|ref|XP_006682.2| k lo-lo-hi-hi

433627 AF078866 Hs.284296 Homo sapiens cDNA: FLJ22993 fis, clone K lo-lo-hi-hi

410204 AJ243425 Hs.326035 early growth response 1 lo-lo-hi-hi

432642 BE297635 Hs.3069 heat shock 70kD protein 9B (mortalin-2) lo-lo-hi-hi

400769 Target Exon lo-lo-hi-hi

433980 AA137152 Hs.286049 phosphoserine aminotransferase lo-lo-hi-hi

403725 Target Exon lo-lo-hi-hi

413587 AA156164 Hs.286241 protein kinase, cAMP-dependent, regulato lo-lo-hi-hi

422614 AI908006 Hs.295362 Homo sapiens cDNA FLJ14459 fis, clone HE lo-lo-hi-hi

400275 NM_006513*:Homo sapiens seryl-tRNA synth lo-lo-hi-hi

402810 NM_004930*:Homo sapiens capping protein lo-lo-hi-hi

452049 BE268289 Hs.27693 peptidylprolyl isomerase (cyclophilin)-l lo-lo-hi-hi

445677 H96577 Hs.6838 ras homolog gene family, member E lo-lo-hi-hi

428770 AK001667 Hs.193128 hypothetical protein FLJ10805 lo-lo-hi-hi

426403 AI353048 Hs.326159 leucine rich repeat (in FLII) interactin lo-lo-hi-hi

434647 W74158 Hs.103189 lipopolysaccharide specific response-68 lo-lo-hi-hi

402807 ENSP00000235229:SEMR lo-lo-hi-hi

413992 W26276 Hs.136075 RNA, U2 small nuclear lo-lo-hi-hi

407191 AA608751 gb:ae56h07.s1 Stratagene lung carcinoma lo-lo-hi-lo

403328 Target Exon lo-lo-hi-hi

411984 NM_005419 Hs.72988 signal transducer and activator of trans lo-lo-hi-lo

451017 BE391847 Hs.181173 hypothetical protein MGC10771 lo-lo-hi-hi

404108 C7000911*:gi|4235142|gb|AAD14470.1| (AC0 lo-lo-hi-hi

407819 R42185 Hs.102720 ESTs lo-lo-hi-hi

435876 AW812586 Hs.160271 G protein-coupled receptor 48 lo-lo-hi-lo

436716 AI433540 gb:069g05jc1 NCI_CGAP_Wd1 1 Homo sapien lo-lo-hi-hi

401419 Target Exon lo-lo-hi-hi

424363 AW512144 Hs.346947 ESTs, Weakly similar to A48809 carboxyle lo-lo-hi-hi

408866 AW292096 Hs.255036 ESTs lo-lo-hi-hi

415516 F11411 gb:HSC2WF081 normalized infant brain cDN lo-lo-hi-hi

423144 AW851527 Hs.253677 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-hi

452560 BE077084 Hs.99969 ESTs lo-lo-hi-hi

439827 AA846538 Hs.187389 ESTs lo-lo-hi-hi

419709 AA255592 Hs.347973 ESTs, Weakly similar to alternatively sp lo-lo-hi-hi

413672 BE156536 gb:QV0-HT0368-310100-091-h10 HT0368 Homo lo-lo-hi-hi

425291 AA354572 gb:EST62857 Jurkat T-cells V Homo sapien lo-lo-hi-hi

427403 AA402107 Hs.257146 ESTs, Moderately similar to I38022 hypot lo-lo-hi-hi

430911 AW937461 Hs.255377 ESTs lo-lo-hi-hi

435293 AI040777 Hs.117170 ESTs lo-lo-hi-hi

448490 AI523897 Hs.271692 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-hi

449539 W80363 Hs.58446 ESTs lo-lo-hi-hi

458082 AW978811 Hs.314451 ESTs, Weakly similar to ALU1_HUMAN ALU S lo-lo-hi-hi

459407 N92114 gb:za22!i11.r1 Soares fetal liver spleen lo-lo-hi-hi

423231 AA323486 Hs.271273 Homo sapiens cDNA FLJ12335 fis, clone MA lo-lo-hi-hi

450828 AW382884 Hs.204715 ESTs lo-lo-hi-hi

411690 AA669253 Hs.136075 RNA, U2 small nuclear lo-lo-hi-hi

414739 U83867 Hs.77196 spectrin, alpha, non-erythrocytic 1 (alp lo-lo-hi-hi

444169 AV648170 Hs.58756 ESTs lo-lo-hi-hi

420911 U77413 Hs.100293 O-linked N-acetylglucosamine (GlcNAc) tr lo-lo-hi-hi

422195 AB007903 Hs.113082 KIAA0443 gene product lo-lo-hi-hi

452704 AA027823 Hs.149424 Homo sapiens PNAS-130 mRNA, complete cds lo-lo-hi-hi

425074 AA495930 Homo sapiens cDNA: FLJ22165 fis, clone H lo-lo-hi-hi

426376 N46752 Hs.302985 ESTs lo-lo-hi-hi

447754 AW073310 Hs.163533 Homo sapiens cDNA FLJ14142 fis, clone MA lo-lo-hi-hi

413686 AI469213 Hs.71404 ESTs lo-lo-hi-hi

449000 U69560 Hs.3826 kelch-like protein C3IP1 lo-lo-hi-hi

430064 AK000091 Hs.231436 hypothetical protein FLJ20084 lo-lo-hi-hi

412205 N33818 Hs.20274 ESTs, Weakly similar to unnamed protein lo-lo-hi-hi

423955 AI420582 Hs.136164 cutaneous T-cell lymphoma-associated tum lo-lo-hi-hi

455619 BE063853 gb:QV3-BT029641 1 299-022-g09 BT0296 Homo lo-lo-hi-hi
408722
459710 
417918 
402964 
424387 
427220 
410451 
400713 
407218 
449312 
419612 
455272 
401839 
440422 
436819 
413644 
413939 
448198 
450488 
433507 



442789 
407251 
409051 
409123 
416225 
433735 
434404 
446667 
447982 
438890 
427882 
459680 
416632 
453876 
414528 
419902 
409542 
433560 
447499 
435023 
412156 
414505 
404277 
414662 
444430 
445612 
403739 
403740 
411084 
429143 
443060 
422749 
429441 
414382 
441560 
446106 
452239 
446874 
412795 
430325 
426392 
447448 
415743 
431607 
411979 
453620 
431099 
421687 
439565 
442349 
410096 
429447 
431802 
441715 
458230 
428788 
450818 
419576
400401 
427004 
401178 



Ammo 

AJ701596 
AA209205 

AI739312 
AF069517 
BE065687 

AA095473 
N71673 
AI498267 
BE148152 

AW452696 

AA731746 

BE154910 

AL047051 

BE622100 

AA009999 

AI817336 

AW748336 

AW904361 

U67611 

AA080912

AA063403 

AA577730 

AA608955 

AW445034 

BE161878 

H22953 

AA827756 

AA640987 

H96982 

H69480 

AW021748 

AA148950 

AA804409 

AA503020 

AI925195 

AW262580 

AI692552 

H29487 

R45389 

AL036058 
AI611153
N94126 



Hs238i02 
Hs.121592 
Hs.163754 

Hs.284163 
Hs.173993 



Hs.28505 
Hs.223666 
Hs.110613



Hs.130760 

Hs.120232 

Hs.278793 

Hs.199961 

Hs.209406 

Hs.59159 

Hs.19179f 

Hs.110613

Hs.131191 



Hs.188684
Hs.109653
Hs.256578
Hs.224805
Hs.137551
Hs.135049
Hs.193767
Hs.42321
Hs.141304
Hs.110406
Hs.188836
Hs.118920
Hs.36563
Hs.130891
Hs.147674

Hs.17110
Hs.23558

Hs.76807
Hs.6093
Hs.12969



T18987 
AA333327 
D78874 
W01076 
AJ224172 
AW380339 
F13386 
M377165 
AW379378 
AW968304 
BE241753 
AF004562 
AW968324 
BE244285 
AA167664 
AB033097 
X85134 
BE396163 
Y13367 
AL035306 
AF086386 
W40516 
AW245200 
AW812452 
AL133570 
AI929453 
BE311851 
AF082283 
AI740573 
Hs.197335
Hs.278573
Hs.8068

Hs.7888 

Hs.44833 

Hs.170121 

Hs.56156 

Hs.74592 

Hs.239356 

Hs.17384 

Hs.14333 
Hs.72984

Hs.25005 

Hs.249235 
Hs.145599
Hs.193516

Hs.142827 

Hs.91251 

Hsl213107 



ESTs 
ESTs 

hypothetical protein FLJ12606
NM_022095*:Homo sapiens hypothetical C2H
ANKHZN protein
RNA binding motif protein 6
gb:RC3-BT0316-27O40M16-ft0 BT0316 Homo
NM_006165*:Homo sapiens nuclear factor r
ubiquitin-conjugating enzyme E2H (homolo

KIAA0421 protein

gb:RC4-HT0231-041199-012-b04 HT0231 Homo
NM_005177*:Homo sapiens ATPase, H+ trans
myosin phosphatase, target subunit 2
ESTs

ESTs, Weakly similar to Z195_HUMAN ZINC
ESTs, Weakly similar to ALU7_HUMAN ALU S
ESTs, Weakly similar to I38600 zinc fing
ESTs, Moderately similar to HPV16 E1 pro
ESTs

KIAA0421 protein

ESTs, Weakly similar to ALU7_HUMAN ALU S
transaldolase 1

gb:zn04d03.r1 Stratagene hNT neuron (937

gb:zrn04d12.s1 Stratagene corneal stroma

ESTs, Weakly similar to PC4259 ferritin

ESTs 

ESTs 

ESTs 

ESTs 

ESTs, Weakly similar to ALU7_HUMAN ALU S 

ESTs 

ESTs 

ESTs 

ESTs, Weakly similar to I38022 hypotheti

ESTs

ESTs

hypothetical protein FLJ22418
hypothetical protein MGC4400
protocadherin beta 16

gb:wd73f12.x1 NCI_CGAP_Lu24 Homo sapiens
Homo sapiens mRNA; cDNA DKFZp434C2016 (f
ESTs, Weakly similar to A48042 lysosomal
NM_019111*:Homo sapiens major histocompa
major histocompatibility complex, class
Homo sapiens cDNA: FLJ22783 fis, clone K
hypothetical protein

ENSP00000251563*:UDP-glucuronosyltransfe
NM_001076*:Homo sapiens UDP glycosyltran
ESTs, Moderately similar to KIAA0877 pro
plasma glutamate carboxypeptidase
procollagen C-endopeptidase enhancer 2



lipophilin B (uteroglobin family member)
hematopoietic PBX-interacting protein
Homo sapiens clone 23736 mRNA sequence
ESTs

protein tyrosine phosphatase, receptor t
ESTs

special AT-rich sequence binding protein

syntaxin binding protein 1

ESTs

F-box only protein 29

ESTs, Weakly similar to Z195_HUMAN ZINC

KIAA1271 protein

retinoblastoma-binding protein 5

ESTs, Weakly similar to ALU5_HUMAN ALU S

phosphoinositide-3-kinase, class 2, alph

hypothetical protein MGC14797

ESTs

Homo sapiens cDNA: FLJ22119 fis, clone H
hypothetical protein MGC5540
ESTs, Weakly similar to S14747 sphingomy
Homo sapiens mRNA; cDNA DKFZp434L201 (fr
Homo sapiens cDNA FLJ13289 fis, clone OV
KIAA1624 protein
B-cell CLL/lymphoma 10
P311 protein

hypothetical protein FLJ11198

Homo sapiens endogenous retrovirus RAN1 

ESTs 

RNA binding motif protein, X chromosome 



lo-to-hi-to" 

to-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

to-to-hi-hi 

lo-lo-hi-hi 

to-lo-hi-hi 

lo-lo-hi-hi 

to-to-hi-ni 

to-lo-hi-hi 

to-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hi-hi 

lo-lo-hWo 

to-to-hi-lo 

lo-lo-hi-lo 

lo-to-hMo 

lo-lo-hi-lo 

to-to-hi-lo 

lo-to-hMo- 

to-lo-hMo 

to-to-W-to 

lo-lo-hi-lo 

Jo-to-hi-to 

lo-lo-hi-lo 

lo-lo-hi-lo 

to-to-hi-lo 

lo-to-hMo 

lo-lo-hi-lo 

to-Io-hi-to 

lo-lo-hi-lo 

lo-lo-hi-lo 

lo-lo-hi-lo 

lo-to-hi-to 

lo-to-hMo 

lo-lo-hi-lo 

lo-lo-hi-to 

lo-lo-hi-lo 

lo-to-hi-fo 

lo-lo-hWo 

lo-to-hi-to 

lo-lo-hi-lo 

h4o~hUo 

lo-lo-hMo 

to-to-hWo 

lo-to-hMo 

lo-lo-hi-lo 

lo-lo-hMo 

lo-lo-hi-lo 

to-to-hi-lo 

lo-to-hi-to 

lo-lo-hi-to 
lo-lo-hi-lo

lo-lo-hi-lo

lo-lo-hi-lo

lo-lo-hi-lo

lo-lo-hi-lo
423749 U09848 Hs.132390 zinc finger protein 36 (KOX 18) lo-lo-hi-lo

428898 AB033070 Hs.1944G8 KIAA1244 protein lo-lo-hi-lo

458258 AW406546 Hs.127971 ESTs lo-lo-hi-lo

429521 BE048708 Hs.50949 ESTs lo-lo-hi-lo

402185 Target Exon lo-lo-hi-lo

415961 H10983 Hs.155919 ESTs lo-lo-hi-lo

457265 AB023212 Hs.225967 KIAA0995 protein lo-lo-hi-lo

412419 AW948630 gb:QV0-FT0001-050500-226-g05 FT0001 Homo lo-lo-hi-lo

438397 AA806478 Hs.123206 ESTs lo-lo-hi-lo

440509 BE410132 Hs.134202 ESTs, Weakly similar to T17279 hypotheti lo-lo-hi-lo

423895 AA332215 gb:EST36124 Embryo, 8 week I Homo sapien lo-lo-hi-lo

400251 NM_004651*:Homo sapiens ubiquitin specif lo-lo-hi-lo

445094 AW296163 Hs.147296 ESTs lo-lo-hi-lo

432323 AK001409 Hs.274356 hypothetical protein FLJ10547 lo-lo-hi-lo

444290 AA262496 gb:zs20f1U1 NCI_CGAP_GCB1 Homo sapiens lo-lo-hi-lo

435803 Z44194 Hs.4994 transducer of ERBB2, 2 lo-lo-hi-lo

436905 N31273 Hs.42380 ESTs lo-lo-hi-lo

401849 Target Exon lo-lo-hi-lo

402249 C19000553*:gi|12741444|ref|XP_008888.2| lo-lo-hi-lo

406180 AB018249 small inducible cytokine subfamily A (Cy lo-lo-hi-lo

448176 AJ672546 Hs.170507 ESTs lo-lo-hi-lo

409259 AW608930 Hs.52184 hypothetical protein FLJ20618 lo-lo-hi-lo

457335 AW969834 Hs.303303 ESTs lo-lo-hi-lo

452444 BE144022 gb:MR0-HT0165-191199-004-f05 HT0165 Homo lo-lo-hi-lo

405429 Target Exon lo-lo-hi-lo

430103 AA465259 gb:aa33b03.M NCI_CGAP_GCB1 Homo sapiens lo-lo-hi-lo

439944 AA856767 Hs.124623 ESTs lo-lo-hi-lo

411283 AW852754 gb:PM1-CT0247- 1 801 0f>009^05 CT0247 Homo lo-lo-hi-lo

458195 R10085 Hs.130370 ESTs lo-lo-hi-lo

452654 BE004783 gb:MR2-BN0114-270400-004-e11 BN0114 Homo lo-lo-hi-lo

425684 AF000989 Hs.159201 thymosin, beta 4, Y chromosome lo-lo-hi-lo

429452 AI949495 Hs.133998 Homo sapiens cDNA FLJ13202 fis, clone NT lo-lo-hi-lo

431709 AF220185 Hs.267923 uncharacterized hypothalamus protein HT0 lo-lo-hi-lo

411701 BE181659 gb:QV1-HT0638-070500-191-g07 HT0638 Homo lo-lo-hi-lo

430729 M572560 Hs.301283 KIAA0793 gene product lo-lo-hi-lo

447476 BE293466 Hs.20880 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-lo

450436 AW293661 Hs.131887 ESTs lo-lo-hi-lo

405365 CX001212*:gi|7861932|gb|AAF70445.1| (AF2 lo-lo-hi-lo

419555 AA244416 gb:nc07d11.s1 NCI_CGAP_Pr1 Homo sapiens lo-lo-hi-lo

446103 U90918 Hs.13804 hypothetical protein dJ462023.2 lo-lo-hi-lo

400986 NM_024085*:Homo sapiens hypothetical pro lo-lo-hi-lo

424194 BE245833 Hs.169854 gb:TCBAP1Et908 Pediatric pre-B cell acut lo-lo-hi-lo

400210 Eos Control lo-lo-hi-lo

400234 NM_005336:Homo sapiens high density lipo lo-lo-hi-lo

400235 NM_005336:Homo sapiens high density lipo lo-lo-hi-lo

405387 NM_022170*:Homo sapiens Williams-Beuren lo-lo-hi-lo

433075 NM_002959 sortilin 1 lo-lo-hi-lo

406302 C16000922:gi|7499103|pir||T20903 hypothe lo-lo-hi-lo

428181 AA423976 gb:zv62n06.s1 Soares_testis_NHT Homo sap lo-lo-hi-lo

456629 AW891965 Hs.279789 histone deacetylase 3 lo-lo-hi-lo

426940 AA393537 Hs.98347 ESTs, Weakly similar to JC5308 testis-sp lo-lo-hi-lo

433555 AA535902 Hs.146211 Homo sapiens HERC2P7 pseudogene, partial lo-lo-hi-lo

421431 AA650117 Hs.283107 ESTs lo-lo-hi-lo

448631 AI554923 gb:ie53M2.x1 Soares_NFL_T_GBC_S1 Homo s lo-lo-hi-lo

433521 T66087 Hs.112482 Homo sapiens unknown mRNA sequence lo-lo-hi-lo

407187 AA446971 gb:zw85f11.s1 Soares_total_fetus_Nb2HF8_ lo-lo-hi-lo

450739 AI732707 Hs.116506 ESTs, Weakly similar to ALU7_HUMAN ALU S lo-lo-hi-lo

440004 BE397117 Hs.120824 hypothetical protein FLJ21845 lo-lo-hi-lo

403947 NM_005032 plastin 3 (T isoform) lo-lo-hi-lo

405529 AW410458 chromosome 11 open reading frame 2 lo-lo-hi-lo

402163 C19001075*:gi|4567179|gb|AAD23607.1|AC00 lo-lo-hi-lo

404663 ENSP00000251884:KIAA1521 protein (Fragme lo-lo-hi-lo

400220 Eos Control lo-lo-hi-lo

401444 Target Exon lo-lo-hi-lo

455824 BE143703 gb:MR0-HT0164-191199-004-f03 HT0164 Homo lo-lo-hi-lo

400206 Eos Control lo-lo-hi-lo

458659 AW749895 Hs.332520 Homo sapiens mRNA; cDNA DKFZp434A1014 (f lo-lo-hi-lo

428666 AU080190 Hs.189242 Homo sapiens mRNA; cDNA DKFZp434A202 (fr lo-lo-hi-lo

428442 AA428638 Hs.98606 ESTs lo-lo-hi-lo

440151 AA868167 gb:ak38e07.s1 Soares_testis_NHT Homo sap lo-lo-hi-lo

431046 AW854382 Hs.249126 Homo sapiens clone 24894 mRNA sequence lo-lo-hi-lo

443914 AI691173 Hs.222362 ESTs, Weakly similar to p40 [H.sapiens] lo-lo-hi-lo

402469 Target Exon lo-lo-hi-lo

418155 R45481 Hs.23719 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-lo

446893 AI610818 Hs.7110 ESTs lo-lo-hi-lo

442336 AW340958 Hs.7572 ESTs lo-lo-hi-lo

421290 NM_014368 Hs.103137 LIM homeobox protein 6 lo-lo-hi-lo

450374 AA397540 Hs.60293 Homo sapiens clone 122482 unknown mRNA lo-lo-hi-lo

402347 Target Exon lo-lo-hi-lo

415184 AA380436 Hs.211973 homolog of Yeast RRP4 (ribosomal RNA pro lo-lo-hi-lo

415632 U67085 Hs.78524 TcD37 homolog lo-lo-hi-lo

423718 AL119520 Hs.180737 Homo sapiens clone 23664 and 23905 mRNA lo-lo-hi-lo
449140 AW013840 Hs.202092 ESTs lo-lo-hi-lo

431241 AA496799 Hs.36958 ESTs lo-lo-hi-lo

416631 H69466 gb7r88fO7.il Soares fetal liver spleen lo-lo-hi-lo

424168 L29277 Hs.321677 signal transducer and activator of trans lo-lo-hi-lo

401600 BE247275 U5 snRNP-specific protein, 116 kD lo-lo-hi-lo

420588 AF000982 Hs.147916 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep lo-lo-hi-lo

414111 BE047679 Hs.152982 hypothetical protein FLJ13117 lo-lo-hi-lo

417138 AA193646 Hs.65771 Homo sapiens chromosome 19, BAC OT-HSPC lo-lo-hi-lo

424318 M476515 Hs.172723 ESTs lo-lo-hi-lo

455653 BE154075 gb:PM0-HT0339-200400-010-E05 HT0339 Homo lo-lo-hi-lo

451493 H38656 Hs.32854 ESTs lo-lo-hi-lo

457015 AA688058 Hs.261544 ESTs lo-lo-hi-lo

403654 NM_003071:Homo sapiens SWI/SNF related, lo-lo-hi-lo

435203 AW957127 Hs.294027 ESTs lo-lo-hi-lo

409322 BE091159 Hs.22687 ESTs, Moderately similar to unnamed prot lo-lo-hi-lo

437764 AA767795 Hs.166832 ESTs lo-lo-hi-lo

432542 AW083920 Hs.16098 claudin 2 lo-lo-hi-lo

436125 AA765895 Hs.152895 ESTs lo-lo-hi-lo

403217 AL134878 ribosomal protein, large P2 lo-lo-hi-lo

434023 AI277883 Hs.146141 ESTs lo-lo-hi-lo

442419 AI749893 Hs.270532 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-lo

443667 AI129066 Hs.135467 ESTs lo-lo-hi-lo

451445 AA017609 Hs.343449 gb:ze37e01.r1 Soares retina N2b4HR Homo lo-lo-hi-lo

454775 BE160229 gb:QV1-HT0413-090200462-a12 HT0413 Homo lo-lo-hi-lo

411053 AW815061 gb:CM0-ST0209-271099-082-d10 ST0209 Homo lo-lo-hi-lo

435312 AJ243396 Hs.4865 voltage-gated sodium channel beta-3 subu lo-lo-hi-lo

450875 AK000724 Hs.301553 karyopherin alpha 6 (importin alpha 7) lo-lo-hi-lo

451180 H61899 Hs.171937 steroid dehydrogenase-like lo-lo-hi-lo

427327 AW501456 Hs.288283 Homo sapiens cDNA: FLJ22355 fis, clone H lo-lo-hi-lo

444321 AW204210 Hs.122275 Homo sapiens mRNA; cDNA DKFZp564N1623 (f lo-lo-hi-lo

405109 N47812 CGI-35 protein lo-lo-hi-lo

450182 AI796400 Hs.240767 Human DNA sequence from clone RP1-12G14 lo-lo-hi-lo

424990 AU076896 Hs.154095 zinc finger protein 143 (clone pHZ-1) lo-lo-hi-lo

428997 AF085391 Hs.194718 zinc finger protein 265 lo-lo-hi-lo

402602 NM_021186*:Homo sapiens zona pellucida g lo-lo-hi-lo

428772 AI524039 Hs.192524 ESTs lo-lo-hi-lo

423759 AI142358 Hs.184361 ESTs, Moderately similar to ALU7_HUMAN A lo-lo-hi-lo

434350 AL042940 Hs.93872 KIAA1682 protein lo-lo-hi-lo

442274 AI733484 Hs.129182 ESTs lo-lo-hi-lo

442884 AI076570 Hs.134053 ESTs lo-lo-hi-lo

400481 Target Exon to-lo-hMo 

407283 T51008 gb:yb55e08.s1 Stratagene ovary (937217) lo-lo-hi-lo 

408859 AW291672 Hs.258981 ESTs to-to-hMo 

Ar 455615 BE045344 Hs.274923 ESTs, Moderately similar to unnamed prot lo-lo-hi-lo 

45 427315 AA179949 Hs.175563 Homo sapiens mRNA; cDNA OKFZp564N0763 (f lo-lo-hi-lo 

449375 R07114 Hs.271224 ESTs lo-lo-hi-lo 

419937 AB040959 Hs.93836 DKFZP434N014 protein to-kMiMo 

422231 AA443512 Hs.101383 ESTs to-toW-to 

437210 AA311443 Hs.293563 Homo sapiens mRNA; cDNA DKFZp586E2317 (f lo-to-hMo 

418056 AA524886 gb:nh34f02.s1 NCI_CGAP_Pr3 Homo sapiens lo-lo-hi-lo
446586 N58790 Hs.268820 ESTs lo-lo-hi-lo
407949 W21874 Hs.247057 ESTs, Weakly similar to 2109260A B cell lo-lo-hi-lo
440296 030829 Hs.180610 splicing factor proline/glutamine rich ( lo-lo-hi-lo
422260 AA315993 Hs.105484 regenerating gene type IV lo-lo-hi-lo
434685 AA642445 Hs.287467 Homo sapiens cDNA FLJ11948 fis, clone HE lo-lo-hi-lo
412657 AW976165 gb:EST3B8274 MAGE resequences, MAGN Homo lo-lo-hi-lo
405188 Target Exon lo-lo-hi-lo
416954 AI222358 gb:qh04c12.x1 Soares_NFL_T_GBC_S1 Homo s lo-lo-hi-lo
423700 AA232375 Hs.58606 SNRPN upstream reading frame lo-lo-hi-lo
430288 BE394943 Hs.13804 hypothetical protein dJ462023.2 lo-lo-hi-lo
435184 T67162 Hs.135127 ESTs, Weakly similar to unnamed protein lo-lo-hi-lo
431475 AI567669 Hs.40342 putative nuclear protein lo-lo-hi-lo
445239 AI217375 Hs.170023 ESTs, Weakly similar to CA36_HUMAN COLLA lo-lo-hi-lo
436151 AK000801 Hs.324271 Homo sapiens cDNA FLJ20794 fis, clone CO lo-lo-hi-lo
448489 AI523875 gb:tg97d04.x1 NCI_CGAP_CLL1 Homo sapiens lo-lo-hi-lo
424470 BE244261 Hs.323502 Homo sapiens cDNA: FLJ23539 fis, clone L lo-lo-hi-lo
434733 AI334367 Hs.159337 ESTs lo-lo-hi-lo
409469 AW517236 Hs.335762 ESTs lo-lo-hi-lo
414034 U89277 Hs.305985 early development regulator 1 (homolog o lo-lo-hi-lo
420382 AW959165 Hs.270034 Homo sapiens, Similar to nuclear localiz lo-lo-hi-lo
430433 AA478883 Hs.273766 ESTs lo-lo-hi-lo
435351 T80177 Hs.118064 similar to rat nuclear ubiquitous casein lo-lo-hi-lo
403218 AL134878 ribosomal protein, large P2 lo-lo-hi-lo
420678 AW593288 Hs.3530 TLS-associated serine-arginine protein 2 lo-lo-hi-lo
445808 AV655234 ESTs, Moderately similar to PC4259 ferri lo-lo-hi-lo
429933 AA765596 Hs.187691 ESTs lo-lo-hi-lo
419802 AA250950 Hs.154334 ESTs lo-lo-hi-lo
425155 W26522 Hs.75890 gb:32g2 Human retina cDNA randomly prime lo-lo-hi-lo
417314 N68168 gb^a11c01.s1 Soares fetal liver spleen lo-lo-hi-lo
428290 AI932995 Hs.183475 Homo sapiens clone 25061 mRNA sequence lo-lo-hi-lo
422128 AW881145 gb:QV0-OT0033^1040O-182-a07 OT0033 Homo lo-lo-hi-lo
432014 H66741 HsJ38540 ESTs, Weakly similar to ALU4_HUMAN ALU S lo-lo-hi-lo
407351 AW383165 gb:PM3-HT0344-1512994KM-f07 HT0344 Homo lo-lo-hi-lo
443231 W87548 Hs.132932 ESTs lo-lo-hi-lo
444001 AI095087 Hs.152299 ESTs, Moderately similar to S65657 alpha lo-lo-hi-lo
435084 T70740 Hs.31433 ESTs lo-lo-hi-lo
435173 AW295645 Hs.255451 ESTs lo-lo-hi-lo
411831 AW994394 gb:RC3-BN0036-050400-014-h12 BN0036 Homo lo-lo-hi-lo
446572 AV659151 Hs.282961 ESTs lo-lo-hi-lo
428114 AI821548 Hs.98363 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-lo
406207 Target Exon lo-lo-hi-lo
405011 Target Exon lo-lo-hi-lo
409451 AF012626 Hs.54472 fragile X mental retardation 2 lo-lo-hi-lo
411233 AW833793 gb:QV4-TT0008.13010(W)80-a06 TT0008 Homo lo-lo-hi-lo
455729 BE072092 gb:PM4-BT0532-160200-003^>11 BT0532 Homo lo-lo-hi-lo
439454 AA836120 Hs.258958 ESTs lo-lo-hi-lo
445124 AJ806403 Hs.143942 ESTs lo-lo-hi-lo
410324 AW292539 Hs.30177 ESTs lo-lo-hi-lo
446548 AI769392 Hs.200215 ESTs lo-lo-hi-lo
416999 AW195747 Hs.21122 hypothetical protein FLJ11830 similar to lo-lo-hi-lo
414553 AI813865 Hs.164478 hypothetical protein FLJ21939 similar to lo-lo-hi-lo
444647 H14718 Hs.11506 Human clone 23589 mRNA sequence lo-lo-hi-lo
418271 NM_000919 Hs.83920 peptidylglycine alpha-amidating monooxyg lo-lo-hi-lo
407939 W05608 Hs.312679 ESTs, Weakly similar to A49019 dynein he lo-lo-hi-lo
432676 AI187366 gb:qf29c01.x1 Soares_testis_NHT Homo sap lo-lo-hi-lo
415156 X84908 Hs.78060 phosphorylase kinase, beta lo-lo-hi-lo
432679 AI146956 Hs.146723 ESTs, Weakly similar to A53950 transcrip lo-lo-hi-lo
412121 AB033061 Hs.73287 KIAA1235 protein lo-lo-hi-lo
418858 AW961605 Hs.21145 hypothetical protein RG083M05.2 lo-lo-hi-lo
425204 NM_002436 Hs.1861 membrane protein, palmitoylated 1 (55kD) lo-lo-hi-lo
418348 AI537167 Hs.96322 hypothetical protein FLJ23560 lo-lo-hi-lo
410765 AI694972 Hs.66180 nucleosome assembly protein 1-like 2 lo-lo-hi-lo
445594 AW058463 Hs.12940 zinc-fingers and homeoboxes 1 lo-lo-hi-lo
416503 H98502 Hs.269853 ESTs lo-lo-hi-lo
426167 AF039023 Hs.167496 RAN binding protein 6 lo-lo-hi-lo
451752 AB032997 Hs.26966 KIAA1171 protein lo-lo-hi-lo
447124 AW976438 Hs.17428 RBP1-like protein lo-lo-hi-lo
419872 AM22951 Hs.146162 ESTs lo-lo-hi-lo
443161 AI038316 gb:ox48c08.x1 Soares_total_fetus_Nb2HF8_ lo-lo-hi-lo
445391 T92576 Hs.191168 ESTs lo-lo-hi-lo
443801 AW206942 Hs.253594 intron of: trichorhinophalangeal syndro lo-lo-hi-lo
446706 AW807631 Hs.190488 Homo sapiens, Similar to nuclear localiz lo-lo-hi-lo
428172 U09367 Hs.182828 zinc finger protein 136 (clone pHZ-20) lo-lo-hi-lo

421021 AA808018 Hs.109302 ESTs lo-lo-hi-lo
431749 AL049263 Hs.306292 Homo sapiens mRNA; cDNA DKFZp564F133 (fr lo-lo-hi-lo
423784 AK000039 Hs.132826 Homo sapiens cDNA FLJ14913 fis, clone PL lo-lo-hi-lo
419479 AI288348 Hs.23450 mitochondrial ribosomal protein S25 lo-lo-hi-lo
450900 H61005 Hs.37902 ESTs lo-lo-hi-lo
423396 AI382555 Hs.127950 bromodomain-containing 1 lo-lo-hi-lo
426137 AL040683 Hs.167031 DKFZP566D133 protein lo-lo-hi-lo
442012 AI733277 Hs.128321 ESTs lo-lo-hi-lo
452271 AA025976 Hs.34569 ESTs lo-lo-hi-lo
414882 D79994 Hs.77546 Homo sapiens cDNA: FLJ21983 fis, clone H lo-lo-hi-lo
432195 AJ243669 Hs.8127 KIAA0144 gene product lo-lo-hi-lo
430217 N47863 Hs.180450 ribosomal protein S24 lo-lo-hi-lo
429567 R35606 Hs.326800 Human EST clone 53125 mariner transposon lo-lo-hi-lo
438810 AW897846 Hs.6421 hypothetical protein DKFZp761N09121 lo-lo-hi-lo
436796 BE515260 Hs.5320 hypothetical protein lo-lo-hi-lo
426352 N72324 Hs.55098 ESTs lo-lo-hi-lo
415308 F05251 gb:HSC04H101 normalized infant brain cDN lo-lo-hi-lo
420148 U34227 Hs.95361 myosin VIIA (Usher syndrome 1B (autosoma lo-lo-hi-lo
434442 AA737415 Hs.152826 ESTs lo-lo-hi-lo
449429 AA054224 Hs.59847 ESTs lo-lo-hi-lo
410245 C17908 Hs.194125 ESTs lo-lo-hi-lo
421168 AF182277 Hs.330780 cytochrome P450, subfamily IIB (phenobar lo-lo-hi-lo
436237 R11528 Hs.271968 ESTs lo-lo-hi-lo
440668 AI989538 Hs.191074 ESTs lo-lo-hi-lo
422068 AI807519 Hs.104520 Homo sapiens cDNA FLJ13694 fis, clone PL lo-lo-hi-lo
410216 BE061839 gb:RC1-BT0254-290100-015-a05 BT0254 Homo lo-lo-hi-lo
439437 AI207788 Hs.343628 sialyltransferase 4B (beta-galactosidase lo-lo-hi-lo
417061 AI675944 Hs.188691 Homo sapiens cDNA FLJ12033 fis, clone HE lo-lo-hi-lo
403046 NM_005656*:Homo sapiens transmembrane pr lo-lo-hi-lo
404528 AI912555 peptide YY, 2 (seminalplasmin) lo-lo-hi-lo
439734 AC005013 Hs.149 cAMP response element-binding protein CR lo-lo-hi-lo
452997 N64777 Hs.44656 ESTs lo-lo-hi-lo
403745 ENSP00000226812*:KIAA1494 protein (Fragm lo-lo-hi-lo
411448 AA178955 Hs.271439 ESTs, Weakly similar to I38022 hypotheti lo-lo-hi-lo
422460 AW445014 Hs.197746 ESTs lo-lo-hi-lo
404058 Target Exon lo-lo-hi-lo
436184 BE154067 Hs.136660 ESTs, Weakly similar to ZN91_HUMAN ZINC lo-lo-hi-lo
427702 N76589 Hs.14454 ESTs, Weakly similar to TFIID subunit TA lo-lo-hi-lo
440695 AW088363 Hs.246240 ESTs lo-lo-hi-lo
424881 AL119690 Hs.153618 HCGVIII-1 protein lo-lo-hi-hi
440573 BE550891 Hs.270624 ESTs lo-lo-hi-hi



416659 
436731 
405102 
450219 
404527 
439158 
431952 
418584 
424241 
410124 



W22048 Hs.64753 
AA580591 Hs.180789



424001 
441399 
440184 
421996 
444252 
402082 
405396 
412457 
415808 
441494 
437330 
452784 
410037 
449145 
452487 
431031 
427209 
434280 
418236 
429201 
416653 
422501 
425087 
426798 
443798 
427254 
431657 



446006 
418259 
410173 
436023 
448428 
430665 
432559 
451572 
456032 
438209 
438337 
431795 
421114 
431843 
440948 
430105 
439046 
451491 
452789 
419829 
449567 
407787 
409091 
435354 
444809 
422170 
453582 
435905 
443884 
430027 
432582 
417993 
444930 
427794 
410913 
431992 
447846 
430439 
432621 
431427 
408872 
453200 
411529 



AI826999

AI912555 

R60323 

Z70695 

NM_004606

AW995948 

AW962229 

AA830515 

W67883

AI630844 

AB002297 

AW583807 

R21135 



T32587 

R21439 

AW452344 

AL353944 

BE463857 

AB020725 

AI632122 

AW207659 

AA830335 

H06509 

BE005398 

AW994005 

X03178 

AA768553

AA354690 

R62424 

AA385062 

R07848 

AL121523 

AI345227

AA133590 

NM_004403

AA215404 

AA706017 

T81819 

AF282874 

BE350122 

AW452948 

AA018556

AW957446 

AL120659 

AK002056 

AK002088 

AW975051 

AA516420 

AW188311 

X70297 

AA947354 

AI972094 

AW081626 

AI924228

AI990790 

N21307 

AW970386 

AA678267 

BE207568

AI791949 

AW854339 

AW997484 

N20617 

AB023197 

AI623817 

AW963705 

BE185536 

AA709186 

AL050367

NM_002742

AA324057 

AL133561 

AI298501

AK000401

AI476139

AA033832 

AA430348 



Hs.224624 

Hs.193888 

Hs.272240 

Hs.1179 

Hs.182339 

Hs.128927 

Hs.222917 

Hs.137476 

Hs.126919 

Hs.7022 

Hs.1460 

Hs.54985 



Hs.170414 

Hs.334578 

Hs.129977 

Hs.50115 

Hs.151258 

Hs.58009 

Hs.198408 

Hs.6630 

Hs.105273 

Hs.92423 

Hs.337534 

Hs.198246 

Hs.193145 

Hs.144967 

Hs.126059 

Hs.130260 

Hs.188522 

Hs.97774 

Hs.105448 

Hs.250857 

Hs.13530 

Hs.119944

Hs.302251 

Hs.21201 

Hs.157367 

Hs.257631 

Hs.268691 

Hs.301711 

Hs.6111 

Hs.6166 

Hs.270124 

Hs.293156 

Hs.128619
Hs.2540 

Hs.286221 

Hs.242561 

Hs.115185

Hs.188614

Hs.13477

Hs.269423

Hs.117115

Hs.208219

Hs.112432

Hs.33476

Hs.5003

Hs.194397 

Hs.227743 

Hs.168457 

Hs.301183 

Hs.301183 

Hs.99070 

Hs.66762 

Hs.2891 

Hs.77955 

Hs.12807

Hs^52748 

Hs.13291 

Hs^12433 

Hs^17596 



gb:61A12 Human retina cDNA Tsp509I-cleav
S164 protein

C15001220*:gi|4469558|gb|AAD21311.1| (AF
ESTs

peptide YY, 2 (seminalplasmin)
ESTs

Homo sapiens cDNA FLJ11086 fis, clone PL
TATA box binding protein (TBP)-associate
Homo sapiens pyruvate dehydrogenase kina
Homo sapiens cDNA FLJ13903 fis, clone TH
ESTs

paternally expressed 10
ESTs

dedicator of cytokinesis 3

glucagon

ESTs

C18000743*:gi|6678363|ref|NP_033416.1| t
C22000452*:gi|6981522|ref|NP_036781.1| r
paired basic amino acid cleaving system
Homo sapiens, clone IMAGE:3929520, mRNA
ESTs

Homo sapiens mRNA; cDNA DKFZp761J1112 (f
hypothetical protein FLJ21062
KIAA0918 protein
ESTs

Homo sapiens cDNA FLJ13329 fis, clone OV
ESTs

KIAA1566 protein

gb:CM1-BN0116-150400-189-h02 BN0116 Homo
ESTs

group-specific component (vitamin D bind

metallothionein 1E (functional)

ESTs

ESTs

ESTs

ESTs

ESTs

ESTs, Weakly similar to B34087 hypotheti

calcium/calmodulin-dependent protein kin

deafness, autosomal dominant 5

ESTs

ESTs

ESTs

nectin 3; DKFZP566B0846 protein
ESTs, Weakly similar to I78885 serine/th
ESTs

ESTs, Moderately similar to ALU2_HUMAN A
ESTs

aryl hydrocarbon receptor nuclear transl

hypothetical protein FLJ11196

Homo sapiens cDNA FLJ11226 fis, clone PL

ESTs, Weakly similar to I78885 serine/th

ESTs, Weakly similar to I38022 hypotheti

ESTs

cholinergic receptor, nicotinic alpha p
gb:od86e11.s1 NCI_CGAP_Ov2 Homo sapiens
Homo sapiens cDNA FLJ13741 fis, clone PL
ESTs

ESTs, Moderately similar to PC4259 ferri
ESTs

ESTs, Weakly similar to 1207289A reverse

ESTs

ESTs

oculospanin
anti-Mullerian hormone
hypothetical protein FLJ11937
KIAA0456 protein



KIAA0980 protein
ESTs

molecule possessing ankyrin repeats indu
molecule possessing ankyrin repeats indu
ESTs

Homo sapiens mRNA; cDNA DKFZp564A026 (fr
protein kinase C, mu

Homo sapiens cDNA: FLJ23527 fis, clone L

DKFZP434B061 protein

ESTs, Weakly similar to T46428 hypotheti

Homo sapiens cDNA FLJ20394 fis, clone KA

ESTs

ESTs

Homo sapiens cDNA FLJ12927 fis, clone NT



lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi

hUn-to-lo 
hi-hMo-lo 
hi-hMo-to 
hMiMo-to 
hi-hMo-to 
hMiMo-to 
hMiMo-to 
hMiWo-lo 
hi-hMo-lo 
hMiUo-lo 
hi-hMo-to 
hi-hMo-lo 
hi-hMo-to 
hMiMo-to 
hi-hMo-lo 
hi-hMo-lo 
hi-hMo-to 
hi-hMo-lo 
hi-hMo-to 
hi-hMo-k) 
rd-hMo-to 
hMtMo-to 
hMiMo-to 
hMiMo-lo 
hMiMo-lo 
hi-hMo-to 
hMiMo-to 
hi-hMo-lo 
ru-hMo-lo 
hi-hMoJo 
hMtMo-to 
h)-hi4o-k) 
hi-hi-kMo 
hi-hMo-to 
hi-hMo-to 
hMtMo-to 
hMiMo-to 
hMtMcMo 
hi-hMo-to 
hi-hi-lo-to 
hi-hMo-to 
hMiMo-to 
hi-hMo-to 
hi-hi-to-io 
hMiMo-to 
hMtMo-to 
hMiMo-to 
hi-hMo-lo 
hi-hMo-to 
hi-hMo-lo 
hi-hMo-to 
hMiMo4o 
hMtMo-to 
hi-hMcMo 
hi-hMo-to 
hi-hMo-to 
hi-hMo-to 
hi-hMo-to 
hi-hMo-to 
hi-hMo-to 
hi-hMo-to 
hi-hMo-to 
hMiMo-to 
hi-hMo-lo 
hMiMo-lo 
hi-hMo-to 
hMiMo4o 
hi-hMo-to 
hMtMo-to 
hMtMo-to 
hi-hMo-to 
414483 R25513 Hs.10683 ESTs hi-hi-lo-lo
451273 NM_014811 Hs.26163 KIAA0649 gene product hi-hi-lo-lo
437052 AA861697 Hs.120591 ESTs hi-hi-lo-lo
440049 R06699 Hs.19769 hypothetical protein MGC4174 hi-hi-lo-lo
429483 AA974832 Hs.128708 ESTs hi-hi-lo-lo
411296 BE207307 Hs.10114 growth suppressor 1 hi-hi-lo-lo
425188 AK002052 Hs.155071 hypothetical protein FLJ11190 hi-hi-lo-lo
436315 BE390513 Hs.27935 hypothetical protein MGC4837 hi-hi-lo-lo
400297 AI127076 Hs.306201 hypothetical protein DKFZp564O1278 hi-hi-lo-lo
431089 BE041395 ESTs, Weakly similar to unknown protein hi-hi-lo-lo
418824 AW751661 Hs.53542 choreoacanthocytosis gene; KIAA0986 prot hi-hi-lo-lo
449226 AB002365 Hs.23311 KIAA0367 protein hi-hi-lo-lo
450149 AW969781 Hs.132863 Zic family member 2 (odd-paired Drosophi hi-hi-lo-lo
418443 NM_005239 Hs.85146 v-ets avian erythroblastosis virus E26 o hi-hi-lo-lo
458692 BE549905 Hs.231754 ESTs hi-hi-lo-lo
410102 AW248508 Hs.279727 ESTs; homologue of PEM-3 [Ciona savignyi hi-hi-lo-lo
451062 AL110125 Hs.25910 Homo sapiens mRNA; cDNA DKFZp564C1416 (f hi-hi-lo-lo
407633 NM_007069 Hs.37189 similar to rat HREV107 hi-hi-lo-lo
418941 AA452970 Hs.239527 E1B-55kDa-associated protein 5 hi-hi-lo-lo
407059 X95406 gb:H.sapiens cyclin E gene. hi-hi-lo-lo
455956 BE162704 gb:PM1-HT0454-301299-001-d08 HT0454 Homo hi-hi-lo-lo
437763 AA469369 Hs.5831 tissue inhibitor of metalloproteinase 1 hi-hi-lo-lo
451404 AA460775 Hs.6295 ESTs, Weakly similar to T17248 hypotheti hi-hi-lo-lo
428494 AA233439 Hs.184634 hypothetical protein FLJ20005 hi-hi-lo-lo
414957 D61283 Hs.45206 ESTs hi-hi-lo-lo
456415 AI734051 Hs.277102 ESTs, Weakly similar to ALU1_HUMAN ALU S hi-hi-lo-lo
400183 Eos Control hi-hi-lo-lo
400158 ENSP00000244302*:CDNA FLJ11591 fis, clon hi-hi-lo-lo
403893 ENSP00000237068*:Protocadherin alpha 6 p hi-hi-lo-lo
423809 AI223833 Hs.154483 ESTs hi-hi-lo-lo
400170 Eos Control hi-hi-lo-lo
403291 Target Exon hi-hi-lo-lo
422026 U80736 Hs.110826 trinucleotide repeat containing 9 hi-hi-lo-lo
417130 AW276858 Hs.81256 S100 calcium-binding protein A4 (calcium hi-hi-lo-lo
432472 AA548781 Hs.136418 ESTs hi-hi-lo-lo
405231 C2001066:gi|10257425|ref|NPJ)3389Z1| CD hi-hi-lo-lo
400141 Eos Control hi-hi-lo-lo
428971 BE278404 Hs.285813 hypothetical protein FLJ11807 hi-hi-lo-lo
422390 AW450893 Hs.121830 ESTs, Weakly similar to T42682 hypotheti hi-hi-lo-lo
425538 BE270918 Hs.164026 Homo sapiens, clone IMAGE:3534875, mRNA, hi-hi-lo-lo

456972 AI054347 Hs.2017 ribosomal protein L38 hi-hi-lo-lo
456622 AF205849 Hs.107740 Kruppel-like factor 2 (lung) hi-hi-lo-lo
418515 AI568453 Hs.19487 ESTs, Weakly similar to CN1H_HUMAN CORNI hi-hi-lo-lo
448439 BE613082 Hs.28229 ARG99 protein hi-hi-lo-lo
445418 AW139377 Hs.127179 cryptic gene hi-hi-lo-lo
402559 Z23024 Rho GTPase activating protein 1 hi-hi-lo-lo
402575 Z23024 Rho GTPase activating protein 1 hi-hi-lo-lo
420811 AA807544 ESTs, Weakly similar to B34323 GTP-bindi hi-hi-lo-lo
446627 AI973016 Hs.15725 hypothetical protein SBBI48 hi-hi-lo-lo
400247 Eos Control hi-hi-lo-lo
430289 AK001952 Hs.238039 hypothetical protein FLJ11090 hi-hi-lo-lo
400133 Eos Control hi-hi-lo-lo
418816 T29621 Hs.88778 carbonyl reductase 1 hi-hi-lo-lo
433579 BE264473 Hs.284297 hypothetical protein from EUROIMAGE 1967 hi-hi-lo-lo
401952 Target Exon hi-hi-lo-lo
410349 AW663021 Hs.323445 ESTs, Weakly similar to T2D3_HUMAN TRANS hi-hi-lo-lo
417558 AF045229 Hs.82280 regulator of G-protein signalling 10 hi-hi-lo-lo
446851 AW007332 Hs.10450 Homo sapiens cDNA: FLJ22063 fis, clone H hi-hi-lo-lo
404489 Target Exon hi-hi-lo-lo
405802 Target Exon hi-hi-lo-lo
456266 L29073 Hs.198726 cold shock domain protein A hi-hi-lo-lo
457133 M54968 v-Ki-ras2 Kirsten rat sarcoma 2 viral on hi-hi-lo-lo
459330 C16931 gb:C16931 Clontech human aorta polyA mRN hi-hi-lo-lo
433041 BE265848 Hs.289080 colon cancer-associated protein Mic1 lo-lo-lo-hi
446545 AI431798 Hs.164192 ESTs, Weakly similar to Y161_HUMAN HYPOT lo-lo-lo-hi
414911 NM_000107 Hs.77602 damage-specific DNA binding protein 2 (4 lo-lo-lo-hi
414682 AL021154 Hs.76884 inhibitor of DNA binding 3, dominant neg lo-lo-lo-hi
422311 AF073515 Hs.114948 cytokine receptor-like factor 1 lo-lo-lo-hi
447329 BE090517 ESTs, Moderately similar to ALU8_HUMAN A lo-lo-lo-hi
412942 AL120344 Hs.75074 mitogen-activated protein kinase-activat lo-lo-lo-hi
420747 BE294407 Hs.99910 phosphofructokinase, platelet lo-lo-lo-hi
431912 AI660552 HsJ6549 ESTs, Weakly similar to A56154 Abl subst lo-lo-lo-hi
446506 AI123118 Hs.15159 chemokine-like factor, alternatively spl lo-lo-lo-hi
408633 AW963372 Hs.46677 PRO2000 protein lo-lo-lo-hi
433675 AW977653 Hs.75319 ribonucleotide reductase M2 polypeptide hi-lo-lo-hi
424560 AA158727 Hs.150555 protein predicted by clone 23733 hi-lo-lo-hi
425234 AW152225 Hs.165909 ESTs, Weakly similar to I38022 hypotheti hi-lo-lo-hi
439815 AA206079 Hs.6693 hypothetical protein FLJ20420 hi-lo-lo-hi
410174 AA306007 Hs.59461 DKFZP434C245 protein hi-lo-lo-hi
410442 X73424 Hs.63768 propionyl Coenzyme A carboxylase, beta p hi-lo-lo-hi
429190 H18650 Hs.92602 ESTs hi-lo-lo-hi
423619 T48691 Hs.249159 adrenergic, alpha-2A-, receptor hi-lo-lo-hi
433764 AW753676 Hs.39982 ESTs hi-lo-lo-hi
421998 R74441 Hs.117176 poly(A)-binding protein, nuclear 1 hi-lo-lo-hi
451593 AF151879 Hs.26706 CGI-121 protein hi-lo-lo-hi
452092 BE245374 Hs.27842 hypothetical protein FLJ11210 hi-lo-lo-hi
447425 AI963747 Hs.18573 acylphosphatase 1, erythrocyte (common) hi-lo-lo-hi
421654 AW163267 Hs.106469 suppressor of var1 (S.cerevisiae) 3-like hi-lo-lo-hi
432502 NM_014641 Hs.277585 KIAA0170 gene product hi-lo-lo-hi
429597 NM_003816 Hs.2442 a disintegrin and metalloproteinase doma hi-lo-lo-hi
434203 BE262677 Hs.283558 hypothetical protein PRO1855 hi-lo-lo-hi
438461 AW075485 Hs.286049 phosphoserine aminotransferase hi-lo-lo-hi
409142 AL136877 Hs.50758 SMC4 (structural maintenance of chromoso hi-lo-lo-hi
439574 AI469788 Hs.165190 ESTs hi-lo-lo-hi
438182 AW342140 Hs.182545 ESTs, Weakly similar to ALU1_HUMAN ALU S hi-lo-lo-hi
449103 T24968 Hs.23038 HSPC071 protein hi-lo-lo-hi
421059 AI654133 Hs.30212 thyroid receptor interacting protein 15 hi-lo-lo-hi
446939 AL133353 Hs.16606 CGI-32 protein hi-lo-lo-hi
408576 NM_003542 Hs.46423 H4 histone family, member G hi-lo-lo-hi
410073 AW408163 Hs.58488 catenin (cadherin-associated protein), a hi-lo-lo-hi
450912 AW939251 Hs.25647 v-fos FBJ murine osteosarcoma viral onco hi-lo-lo-hi
434701 AA460479 Hs.321707 KIAA0742 protein hi-lo-lo-hi
450455 AL117424 Hs.25035 chloride intracellular channel 4 hi-lo-lo-hi
451144 AW956103 Hs.61712 pyruvate dehydrogenase kinase, isoenzyme hi-lo-lo-hi
427390 AI432163 Hs.268231 Homo sapiens cDNA: FLJ23111 fis, clone L hi-lo-lo-hi
451831 NM_001674 Hs.460 activating transcription factor 3 hi-lo-lo-hi
406776 T16206 Hs.237164 ESTs, Highly similar to LDHH_HUMAN L-LAC hi-lo-lo-hi
428157 AI738719 Hs.198427 hexokinase 2 hi-lo-lo-hi
408096 BE250162 Hs.83765 dihydrofolate reductase hi-lo-lo-hi
418203 X54942 Hs.83758 CDC28 protein kinase 2 hi-lo-lo-hi
449338 H73444 Hs.394 adrenomedullin hi-lo-lo-hi
422082 AA016188 Hs.111244 hypothetical protein hi-lo-lo-hi
407907 AI752235 Hs.41270 procollagen-lysine, 2-oxoglutarate 5-dio hi-lo-lo-hi
416655 AW968613 Hs.79426 BCL2/adenovirus E1B 19kD-interacting pro hi-lo-lo-hi
419551 AW582256 Hs.91011 anterior gradient 2 (Xenopus laevis) hom hi-lo-lo-hi
434094 AA305599 Hs.238205 hypothetical protein PRO2013 hi-lo-lo-hi
443951 F13272 Hs.111334 ferritin, light polypeptide hi-lo-lo-hi
422975 AA347720 Hs.122669 KIAA0264 protein hi-lo-lo-hi
430314 AA369601 Hs.239138 pre-B-cell colony-enhancing factor hi-lo-lo-hi
412664 AA421404 Hs.346868 nucleolar protein p40; homolog of yeast hi-lo-lo-hi
408089 H59799 Hs.42644 thioredoxin-like hi-lo-lo-hi
409690 W45393 Hs.55888 activating transcription factor 7 hi-lo-lo-hi
442332 AI693251 Hs.8248 Target CAT hi-lo-lo-hi

408388 AF091086 Hs.44563 hypothetical protein hi-lo-lo-hi
441252 AW360901 Hs.183047 hypothetical protein MGC4399 hi-lo-lo-hi
433069 X76732 Hs.3164 nucleobindin 2 hi-lo-lo-hi
443837 AI984625 Hs.9884 spindle pole body protein hi-lo-lo-hi
426108 AA622037 Hs.166468 programmed cell death 5 hi-lo-lo-hi
441181 AA416925 Hs.121076 peptidylprolyl isomerase (cyclophilin)-l hi-lo-lo-hi
447397 BE247676 Hs.18442 E-1 enzyme hi-lo-lo-hi
427505 AA361562 Hs.178761 26S proteasome-associated pad1 homolog hi-lo-lo-hi
430287 AW182459 Hs.125759 ESTs, Weakly similar to LEU5_HUMAN LEUKE hi-lo-lo-hi
415857 AA866115 Hs.127797 Homo sapiens cDNA FLJ11381 fis, clone HE hi-lo-lo-hi
423198 M81933 Hs.1634 cell division cycle 25A hi-lo-lo-hi
407687 AK002011 Hs.37558 hypothetical protein FLJ11149 hi-lo-lo-hi
431374 BE258532 Hs.251871 CTP synthase hi-lo-lo-hi
413273 U75679 Hs.75257 stem-loop (histone) binding protein hi-lo-lo-hi
442799 AI564739 Hs.68505 ESTs hi-lo-lo-hi
443881 R64512 Hs.237146 hypothetical protein FLJ12752 hi-lo-lo-hi
416209 AA236776 Hs.79078 MAD2 (mitotic arrest deficient, yeast h hi-lo-lo-hi
421834 BE543205 Hs.288771 DKFZP586A0522 protein hi-lo-lo-hi
411263 BE297802 Hs.69360 kinesin-like 6 (mitotic centromere-assoc hi-lo-lo-hi
413924 AL119964 Hs.75616 seladin-1 hi-lo-lo-hi
450598 AF151076 Hs.25199 hypothetical protein hi-lo-lo-hi
439453 BE264974 Hs.6566 thyroid hormone receptor interactor 13 hi-lo-lo-hi
429612 AF062649 Hs.252587 pituitary tumor-transforming 1 hi-lo-lo-hi
443426 AF098158 Hs.9329 chromosome 20 open reading frame 1 hi-lo-lo-hi
452353 C18825 Hs.29191 epithelial membrane protein 2 hi-lo-lo-hi
419879 Z17805 Hs.93564 Homer, neuronal immediate early gene, 2 hi-lo-lo-hi
422363 T55979 Hs.115474 replication factor C (activator 1) 3 (38 hi-lo-lo-hi
416065 BE267931 Hs.78996 proliferating cell nuclear antigen hi-lo-lo-hi
424308 AW975531 Hs.154443 minichromosome maintenance deficient (S. hi-lo-lo-hi
447519 U46258 Hs.339665 ESTs hi-lo-lo-hi
437679 NM_014214 Hs.5753 inositol(myo)-1(or 4)-monophosphatase 2 hi-lo-lo-hi
446636 AC002563 Hs.15767 citron (rho-interacting, serine/threonin hi-lo-lo-hi
422094 AF129535 Hs.272027 F-box only protein 5 hi-lo-lo-hi
440334 BE276112 Hs.7165 zinc finger protein 259 hi-lo-lo-hi
421921 H83363 Hs.6820 translocase of inner mitochondrial membr hi-lo-lo-hi
422938 NM_001809 Hs.1594 centromere protein A (17kD) hi-lo-lo-hi
427719 AI393122 Hs.134726 ESTs hi-lo-lo-hi
422283 AW411307 Hs.114311 CDC45 (cell division cycle 45, S.cerevis hi-lo-lo-hi
424840 D79987 Hs.153479 extra spindle poles, S. cerevisiae, homo hi-lo-lo-hi
418216 AA662240 Hs.283099 AF15q14 protein hi-lo-lo-hi
412140 AA219691 Hs.73625 RAB6 interacting, kinesin-like (rabkines hi-lo-lo-hi
418322 AA284166 Hs.84113 cyclin-dependent kinase inhibitor 3 (CDK hi-lo-lo-hi
428479 Y00272 Hs.334562 cell division cycle 2, G1 to S and G2 to hi-lo-lo-hi
449722 BE280074 Hs.23960 cyclin B1 hi-lo-lo-hi
417933 X02308 Hs.82962 thymidylate synthetase hi-lo-lo-hi
433001 AF217513 Hs.279905 clone HQ0310 PRO0310p1 hi-lo-lo-hi
413943 AW294416 Hs.144687 Homo sapiens cDNA FLJ12981 fis, clone NT hi-lo-lo-hi
424905 NM_002497 Hs.153704 NIMA (never in mitosis gene a)-related k hi-lo-lo-hi
422765 AW409701 Hs.1578 baculoviral IAP repeat-containing 5 (sur hi-lo-lo-hi
425397 J04088 Hs.156346 topoisomerase (DNA) II alpha (170kD) hi-lo-lo-hi
444371 BE540274 Hs.239 forkhead box M1 hi-lo-lo-hi
422956 BE545072 Hs.122579 ECT2 protein (Epithelial cell transformi hi-lo-lo-hi
444783 AK001468 Hs.62180 anillin (Drosophila Scraps homolog), act hi-lo-lo-hi
453884 AA355925 Hs.36232 KIAA0186 gene product hi-lo-lo-hi
416980 AA381133 Hs.80684 high-mobility group (nonhistone chromoso hi-lo-lo-hi
442432 BE093589 Hs.38178 hypothetical protein FLJ23468 hi-lo-lo-hi
417308 H60720 Hs.81892 KIAA0101 gene product hi-lo-lo-hi
433133 AB027249 Hs.104741 PDZ-binding kinase; T-cell originated pr hi-lo-lo-hi
432626 AA471098 Hs.278544 acetyl-Coenzyme A acetyltransferase 2 (a hi-lo-lo-hi
441020 W79283 Hs.35962 ESTs hi-lo-lo-hi
412281 AI810054 Hs.14119 ESTs hi-lo-lo-hi
435602 AF217515 Hs.283532 uncharacterized bone marrow protein BM03 hi-lo-lo-hi
400882 Target Exon hi-lo-lo-hi
446269 AW263155 Hs.14559 hypothetical protein FLJ10540 hi-lo-lo-hi
417847 AI521558 Hs.7331 hypothetical protein FLJ22316 hi-lo-lo-hi
400881 NM_025080:Homo sapiens hypothetical prot hi-lo-lo-hi
419356 AI656166 Hs.7331 hypothetical protein FLJ22316 hi-lo-lo-hi
400292 AA250737 Hs.72472 BMP-R1B hi-lo-lo-hi
415539 AI733881 Hs.72472 BMP-R1B hi-lo-lo-hi
453935 AI633770 Hs.42572 ESTs hi-lo-lo-hi
420005 AW271106 Hs.133294 ESTs hi-lo-lo-hi
428450 NM_014791 Hs.184339 KIAA0175 gene product hi-lo-lo-hi
436291 BE568452 Hs.344037 protein regulator of cytokinesis 1 hi-lo-lo-hi
441362 BE614410 Hs.23044 RAD51 (S. cerevisiae) homolog (E coli Re hi-lo-lo-hi
428484 AF104032 Hs.184601 solute carrier family 7 (cationic amino hi-lo-lo-hi
418526 BE019020 Hs.85838 solute carrier family 16 (monocarboxylic hi-lo-lo-hi
458809 AW972512 Hs.20985 sin3-associated polypeptide, 30kD hi-lo-lo-hi
444984 H15474 Hs.132898 fatty acid desaturase 1 hi-lo-lo-hi

447342 AI199268 Hs.19322 Homo sapiens, Similar to RIKEN cDNA 2010 hi-hi-lo-lo
428330 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, hi-hi-lo-lo
428336 AA503115 Hs.183752 microseminoprotein, beta- hi-hi-lo-lo
430389 AL117429 Hs.240845 DKFZP434D146 protein hi-hi-lo-lo
417318 AW953937 Hs.240845 ESTs hi-hi-lo-lo
422545 X02761 Hs.287820 fibronectin 1 hi-hi-lo-lo
417640 D30857 Hs.82353 protein C receptor, endothelial (EPCR) hi-lo-lo-lo
422809 AK001379 Hs.121028 hypothetical protein FLJ10549 hi-lo-lo-hi
425580 L11144 Hs.1907 galanin hi-lo-lo-hi
416836 D54745 Hs.80247 cholecystokinin hi-lo-lo-hi
434170 AA626509 Hs.122329 ESTs hi-lo-lo-hi
427958 AA418000 Hs.98280 potassium intermediate/small conductance hi-lo-lo-hi
439706 AW872527 Hs.59761 ESTs, Weakly similar to DAP1_HUMAN DEATH hi-lo-lo-hi
450088 AW292933 Hs.254110 ESTs hi-lo-lo-hi
414219 W20010 Hs.75823 ALL1-fused gene from chromosome 1q hi-lo-lo-hi
419201 M22324 Hs.1239 alanyl (membrane) aminopeptidase (aminop hi-lo-lo-hi
426263 AI908774 Hs.259785 carnitine palmitoyltransferase I, liver hi-lo-lo-hi
456236 AF045229 Hs.82280 regulator of G-protein signalling 10 hi-lo-lo-hi
456607 AI660190 Hs.106070 cyclin-dependent kinase inhibitor 1C (p5 hi-lo-lo-hi
408437 AW957744 Hs.278469 lacrimal proline rich protein hi-lo-lo-hi
421180 BE410992 Hs.258730 heme-regulated initiation factor 2-alpha hi-lo-lo-hi
413437 BE313164 Hs.75361 gene from NF2/meningioma region of 22q12 hi-lo-lo-hi
432415 T16971 Hs.289014 ESTs, Weakly similar to A43932 mucin 2 p hi-lo-lo-hi
449230 BE613348 Hs.211579 melanoma cell adhesion molecule hi-lo-lo-hi
417979 AU077284 Hs.83081 GTP cyclohydrolase I feedback regulatory hi-lo-lo-hi
421877 AW250380 Hs.109059 mitochondrial ribosomal protein L12 hi-lo-lo-hi
412482 AI499930 Hs.334885 mitochondrial GTP binding protein hi-lo-lo-hi
428423 AU076517 Hs.184276 solute carrier family 9 (sodium/hydrogen hi-lo-lo-hi
422947 AA306782 Hs.122552 G-2 and S-phase expressed 1 hi-lo-lo-hi
441072 AW275480 Hs.39504 hypothetical protein MGC4308 hi-lo-lo-hi
415938 BE383507 Hs.78921 A kinase (PRKA) anchor protein 1 hi-lo-lo-hi
432278 AL137506 Hs.274256 hypothetical protein FLJ23563 hi-lo-lo-hi
446651 AA393907 Hs.97179 ESTs hi-lo-lo-hi
431515 NM_012152 Hs.258583 endothelial differentiation, lysophospha hi-lo-lo-hi
445345 AW003850 Hs.12532 chromosome 1 open reading frame 21 hi-lo-lo-hi
458965 AA010319 Hs.60389 ESTs hi-lo-lo-hi
438321 AA576635 Hs.6153 CGI-48 protein hi-lo-lo-hi
416783 AA206186 Hs.79889 monocyte to macrophage differentiation-a hi-lo-lo-hi
453563 AW608906 Hs.181163 hypothetical protein MGC5629 hi-lo-lo-hi
432393 AW205863 Hs.133988 hypothetical protein FKSG28 hi-lo-lo-hi
433914 AF108138 Hs.112160 Homo sapiens DNA helicase homolog (PIF1) hi-lo-lo-hi
414907 X90725 Hs.77597 polo (Drosophila)-like kinase hi-lo-lo-hi
432375 BE536069 Hs.2962 S100 calcium-binding protein P hi-lo-lo-hi
440773 AA352702 Hs.37747 Homo sapiens, Similar to RIKEN cDNA 2700 hi-lo-lo-hi
415994 NM_002923 Hs.78944 regulator of G-protein signalling 2, 24k hi-lo-lo-hi
412722 AI343300 Hs.15091 ESTs
446839 BE091926 Hs.16244 mitotic spindle coiled-coil related prot
428862 NM_000346 Hs.2316 SRY (sex determining region Y)-box 9 (ca
439108 AW163034 Hs.6467 synaptogyrin 3
430178 AW449612 Hs.152475 ESTs
421733 AL119571 Hs.1420 fibroblast growth factor receptor 3 (ach
452410 AL133619 Homo sapiens mRNA; cDNA DKFZp434E2321 (f
430132 AA204686 Hs.234149 hypothetical protein FLJ20647
428297 AA238291 Hs.183583 serine (or cysteine) proteinase inhibito
413142 M81740 Hs.75212 ornithine decarboxylase 1
427239 BE270447 Hs.174070 ubiquitin carrier protein
409738 BE222975 Hs.56205 insulin induced gene 1
410746 BE383816 Hs.12532 chromosome 1 open reading frame 21
424506 AF220490 Hs.149623 group III secreted phospholipase A2
447333 BE090580 Hs.70704 hypothetical protein dJ616B8.3
414761 AU077228 Hs.77256 enhancer of zeste (Drosophila) homolog 2
419602 AW248434 Hs.91521 hypothetical protein
411669 BE612676 Hs.303116 stromal cell-derived factor 2-like 1
452322 BE566343 Hs.28988 glutaredoxin (thioltransferase)
426006 R49031 Hs.22627 ESTs
457465 AW301344 Hs.122908 DNA replication factor
406867 AA157857 Hs.182265 keratin 19
407230 AA157857 Hs.182265 keratin 19
446681 AJ003624 Hs.15896 kendrin

408493 BE206854 Hs.46039 phosphoglycerate mutase 2 (muscle)
439186 AI697274 Hs.105435 GDP-mannose 4,6-dehydratase
424544 M88700 Hs.150403 dopa decarboxylase (aromatic L-amino aci
431325 AW026751 Hs.5794 ESTs, Weakly similar to 2109260A B cell
414922 D00723 Hs.77631 glycine cleavage system protein H (amino
438291 BE514605 Hs.289092 Homo sapiens cDNA: FLJ22380 fis, clone H
418574 N28754 M-phase phosphoprotein 9
409342 AU077058 Hs.54089 BRCA1 associated RING domain 1
432734 AA837396 Hs.263925 LIS1-interacting protein NUDE1, rat homo
436087 BE300296 Hs.5054 CGI-133 protein
420309 AW043637 Hs.21766 ESTs, Weakly similar to ALU5_HUMAN ALU S
411619 AI418609 Hs.71040 hypothetical protein FLJ20425
424381 AA285249 Hs.146329 protein kinase Chk2
442547 AA306997 Hs.217484 ESTs, Weakly similar to ALU1_HUMAN ALU S
430376 AW292053 Hs.12532 chromosome 1 open reading frame 21
434666 AF151103 Hs.112259 T cell receptor gamma locus
412330 NM_005100 Hs.788 A kinase (PRKA) anchor protein (gravin)
452123 AI287615 Hs.38022 ESTs
424893 AW295112 Hs.153648 Homo sapiens cDNA FLJ13303 fis, clone OV
428057 AI343641 Hs.185798 ESTs
431566 AF176012 Hs.260720 J domain containing protein 1
439979 AW600291 Hs.6823 hypothetical protein FLJ10430
418836 AI655499 Hs.161712 ESTs
433757 AI949974 Hs.152670 ESTs
425236 AW067800 Hs.155223 stanniocalcin 2
426215 AW963419 Hs.155223 stanniocalcin 2
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hi-to-lo-hi 
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hMo-to-hi 
hMo-Jo-hi 
hMo-lo-hi 
hMo-lo-hi 
hi-Io-lo-hi 
hMo-lo-hi 
hMo-to-hi 
hMo-lo-hi 
hMo4o4ii 
hMo4o-hi 
N-lo-lo-hi 
hMo-lo-hi 
hMo-to-hi 
hMo-to-W 
hMo-lo-hi 
hMo-lo-hi 
hi-Io-lo-hi 
hi-1o4o-hi 
hMo-lo-hi 
hMo-to-W 
TABLE 2B

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number
Accession: Genbank accession numbers 






Pkey CAT Number 
408660 107294.1 
409051 109699.1



409123 
410216 
410451 
410498 
411053 
411233 
411283 
411701 
411831 
412419 

412492 
412657 
413351 
413509 
413672 
415308 
415516 
416508 
416631 
416954 
417314 
418056 
418259 
418574 

419555 
420811 
421911 
421974 
422128 
423028 
423476 

423895 



425074 
425291 
425980 
426413 
428181 
429163 
429540 
430088 
430103 
430439 
431089 
431843 
432079 
432340 
432676 
433075 



434280 
434609 
435023 
436716 



437576 

438869 
438882 
438980 
439046 
439848 
440151 
440507 



110143J 

1184664 J 

1204118J 

120611.1 

1230446J 

1236369J 

1237666J 

1254466J 

1260400.1 

1293418J 

130082.1 

1318507J 

1363660.1 

1374313J 

1382512J 

1533673 1 

1539185J 

1597894 1 

1605019.1 

163427J 

1666649.1 

171841.1 

173388.1 

17690.1 

185884.1 
196677.1 
208987.1 
209807.1 
211994.1 
224062.1 
22861.1 

233006.1 

241234.1 

246486.1 

249618.1 

258778.1 

266650.1 

287953.1 

300543.1 

305828.1 

312849J 

313089.1 

31808.1 

327825.1 

338324.1 

341 114 J 

345248.1 

352582.2 

35820.1 



382816.1 

38950.1 

398093.1 

425440J 

42814.2 

43892.1 

46651.1 

466649.1 

467544.1 

468133.1 

477806.1 

487109.1 

495677.1 



Accession 

AA525775 AA056342 AI538978 AW975281 AA664986

AA080912 AA075318 AA083403 AA076594 AA078992 AA084926 AA081881 AA113913 AA113892 AA083821 AA134801 AA082953 AA070343

AA062835 AA075419 AA063293 AA071252 AA078900 AA062836 AW974305 

AA063403 AA070823 AA070050 

BE061839 AW859863 AW606085

BE065687 BE065637 AW749002 H73690 

AA355749 AA085520 AW966333 AA340319 BE170936

AW815061 H71965 AW815072 AW815048 AW815041 AW815047 BE152831 BE152490 BE149043 BE149075 BE149035 BE149067 
AW833793 AW833799 AW833346 AW833371 AW833795 AW833562 AW833667 AW833377
AW852754 AW852897 AW852757 AW852617 BE172755 AW835444 
BE181659 AW890576 AW857638 

AW994394 AW865900 AW865905 AW865891 AW866014 AW865898

AW948630 AW948626 AW948634 AW948616 AW948627 AW948615 AW948631 AW948605 AW948611 AW948610 AW948633 AW948623

AW948628 AW948604 AW948602 AW948607 

AW962604 AA368639 AA112257

AW976165 C04000

BE086815 BE086823 R81218 R69229 

BE145419 BE145433 

BE156536 BE156439 BE156700 BE156449 BE156653 BE156533 BE156524 BE156670 BE156721 BE156723 

F05251 R13748 Z44028 H14747

F11411 R15237 Z43915 H20760

R39769 T53143 H60012 

H69466 H93884 N59684 

AI222358 N73390 D61648 AA243520 AA190953

N68168 N69188 N90450 

AA524886 AW971347 AA211537 

AA215404 AI990909 BE464132 AW271459 N74332 AI262061

N28754 N28747 AI568146 AJ979339 AA322671 AA322672 AW955043 AI990326 AA776406 AI016250 AA843678 AW451882 N23137 N23129

W70051 AI038748 AA831327 AI925845 AW945895 

AA244416 AA244401

AA807544 AA280648 AI243056 AI022744 AA705288 AA829425 AW452095 AI929317 R19039 AA282024 

AL041520 AA300086

AA301270 AA301379 AA301366 

AW881145 AA490718 M85637 AA304575 T06067 AA331991 
H90946 AA320597 AW954970 BE143680

AL035633 F11794 F11783 H18042 T66089 H29379 R19493 AW134660 AI299437 AL133995 AA057405 N78357 AA917450 AI002692 T09262

T65008 H29290 AI200874 AA894415 AI732887 AI791768 AI733447 AA988785 N62128 T09261 AW956936

AA332215 AA403110 AW965299

AA343729 AA345779 AA344370 

AA495930 AI470890 H97831 AA350358 BE166712 

AA354572 AW062361 AW813419 AW816041 AI744949 

AA366951 AA470999 AA469425 

AA377823 AW954494 AI022688 

AA423976 AA437075 BE006469 

AA884766 AW974271 AA592975 AA447312

M85776 AA454535 AA456208 H90189 

AA464964 M85405 AA947566 

AA465259 AW897142 AW897144 

AL133561 AL041090 AL117481 AL122069 AW439292 AI968826
BE041395 AA491826 AA621946 AA715980 AA666102

AA516420 C14818 C14815 C15161 C15068 D80763 D60656 AW970134 AA543007 D81004 D60184 AI498371 D60382 D60181 C15876
AW972746 AA525323 AJ150314 
AA534222 AA632632 T81234
AI187366 AA558869 AA618478 

NM_002959 X98248 AA233278 AA846376 AI470560 AI470533 BE327147 AW291971 AA017125 AI198417 AI365213 AJ168442 AI337018
AI475049 H85459 AA969895 AA888000 AA418326 AA418378 N71981 AL043634 AA426361 AA418275 AA232975 AL036861 BE277220 BE387505
N99710 AW375004 AA418268 AL079651 H85743 AW902319 AW805907 AA984366 T92310 AA405425 AA421732 AI656841 AW300968 
AW593418 T92267 BE464032 AW473548 AI359502 BE552306 AI990196 AW518351 AI239559 AW590963 AA018359 AI273737 AL042658
AA411308 AA402810 H38111 AW013931 AW366432 AW752435 AW376124 AI292020 AI292121 AA340647 BE613672 BE409874 AA351915
BE617026 BE019588 AW402692 AW247466 R59233 AA134761 BE254019 BE265105 D63316 BE313080 BE547713 BE536578 BE546749 
AA324185 H17386 BE253377 R87598 H29072 AA350980 BE076629 BE253957 AA532613 BE252486 AW804459 D30966 R87959 AA091832 
BE005398 AA628622 AA994155 
R76593 AF147390 R76594

AI692552 AJ393343 AI800510 AI377711 F24263 AA661876
AI433540 AA728984 AA804981

AI821940 N67106 AI744264 AA808846 AA643417 AA643416 Z70715

BE514383 AA071273 AW247987 AW673286 BE312102 AW749824 BE071985 AW577383 BE071945 BE072005 AW577355 BE071965 AW239231 

BE072000 BE071960 AW577360 AW749830 AW373020 X97303 AW999522 BE000192 BE562219 BE266655 BE264970 

AF075009 R63109 R63068

AA827695 AA833754 AW978946 

AW502384 AI982587 AA828822

AA947354 AA829660 AI687296 

AW979249 D63277 AA846968 

AA868167 F21558 F31418 F35624 

H06994 BE147898



441102 
442048 
443161 
444290 
444314 
445808 
447329 

447448 
448150 
448489 
448631 
448738 
452410 



452444 
452654 
454775 
455019 
455272 
455619 
455653 
455729 
455824 
455956 
456123 
457133 



509604J 

531432J 

561305J 

59994 1 

600667J 

65133J 

71759J 

722246.1 

752165J 

765247.1 

772996.1 

77790J 

91 63 J 



918078.1 

925931J 

1234106J 

1249138J 

1271871.1 

1346387.1 

1348742.1 

1353792.1 

1372880.1 

1387163.1 

1534442.1 

29066.1 



457952 44256.1 
458956 83645.1 



AA973905 AI299888 AA917019 H63235 T90771 

AA974603 AI984319 AW340495

AI038316 AI344631 AI261653 

AA262496 AV648929 AA305356 D61644 D78724 

AI140497 AW749625 AW749626 AW749644

AV655234 AW966332 AA340239 

BE090517 AW970792 AW264490 AW014985 F27436 AA947336 F15843 H89338 AA563626 F17712 BE546579 AA421821 AA284852 AA477751
AW025245 

BE244285 C18429 H42373 AI820706 AI379786 R55439 AW276142 

AI472167 AI990315 R32175

AI523875 R45782 R45781

AI554923 AJ902356

BE614081 W01988 AW500790 

AL133619 AA468118 AA383064 AI476447 T09430 AI673758 AA524895 AI581345 AI300820 AW498812 AA256162 AI559724 AI685732 AA602400
AA905453 AI204595 AW166541 AA157456 AA156269 AA383652 AA431072 AW592707 AI435410 AW272464 AI215594 AA622747 R74039
N35031 AI804128 AW513621 AA868351 AI026826 AI493388 AA614641 W81604 AI567080 AI214351 AA730140 AI125754 AI200813 AI269603
AI565082 AI807095 AI476629 AA505909 AI368449 AI686077 AI582930 AW085038 AA757863 AA730154 AI767072 AA468316 AI734130 AI734138
AA426284 AA433997 AI741241 AW043563 AI732741 AI732734 AA437369 AA425820 AA664048 R74130
BE144022 BE143969 BE143915 
BE004783 BE004947 AI911790

BE160229 AW819879 AW820179 AW819882 AW819876 AW820169 BE153201 AW993736 BE152911

AW850818 AW850833 AW851100

BE148152 BE148133 BE148159 BE148132 AW885107 

BE063853 BE063955 BE063866 BE063705 BE063846 BE061416 BE063844

BE154075 BE153973 BE064861 BE153852 BE153847 BE064684 BE153602 BE065075 BE154018 BE064772 BE064842 BE153557 BE153509
BE072092 BE072106 BE072086 BE072098 BE072103 
BE143703 BE143631 BE143629 BE143702 
BE162704 BE162705 BE162732 BE162702 BE162694 
R00602 Z42921 F06132

M54968 NM_004985 AI808924 AL135130 AW242010 AA476848 AI740449 M17087 K03210 M35505 M35504 L00049 AJ186585 W35273 X01669
X02825 W23635 AI554920 AI539465 AA425263 AI469981 W21091 T28976 AW977922 BE550180 AW664973 AI148939 AW117295 AA811229
AI343010 AA766141 BE219368 N95249 AA280396 AW504574 AA232870 AI770018 AA262948 AW450230 AW362890 AW609417 AW499941
AA425857 AW380665 AA830647 AA282180 T27356 H85307 AA861543 AA356548 AA356410 AW860656 AW860647 AW938103 AW860649 
AI567016 N70374 AW474707 AA505084 AA082195 AW949515 AA361728 N33863 AA411821 AA401640 AW594461 AL120766 AI500024
AW771891 H84567 D51551 AA330460 R14184 AI301629 N64676 AV659669 AI697660 AI004579 AA287927 AW453052 AW601642 AA676681 
AA737010 AA872481 AA281094 AA564243 BE464958 BE049265 AW167917 AA843916 AA525301 AI015987 N25230 AI889481 AW173466
AA937541 AI334416 AI676214 AI281159 AA553559 AA582189 AA255527 AW160515 AA670007 H08199 AA808271 AA281015 W47527 AA649252
AI364302 AA889246 R40473 H02312 AA648116 AA342730 AA243624 R99351 R41588 R49696 AA854442 F01713 AA213685 AA721296 R79833
H84241 R70668 H85554 AA223758 N95349 AI374913 AI306683 AA015609 AA918548 AI453570 AA772321 AI692775 AA195733 AI474563
AW873048 AI209133 AI028182 AI374920 AW572807 AA406223 AA833684 T97255 H69138 AA382906 AW119162 N31974 AI890584 N39418
AA864877 AA679469 BE350651 N41020 AI050915 F00075 AA864878 N26970 AA828898 AW019991 AW796631 AW993262 N48532 BE564662
AV654063 AI754461 AW945712 C03289 AV655314 AV659070 AV659808 AV660435 H70113 C05323 R91984 H96949 AV658936 AV658879
H69137 AA384411 AA412584 C02749 W32014 R58168 C05526 BE536017 N24354 AA287991 N80109 F05452 R12740 H08297 AL138354
AW020801 BE178443 BE178018 BE178336 BE178360 BE178107 BE178385 BE178215 BE178186 BE178447 BE178352 BE178422 BE178424 
BE178043 BE178093 BE178460 BE178356 BE178441 BE178438 BE178467 AI091259 BE177839 BE178094 R28455 BE177844 BE178100
AA262387 R70669 W80934 W93668 AA256711 BE178141 BE177893 BE178449 AA167718 H69694 BE178017 BE178029 BE177999 BE177936 
AA095144 N32462 AA281203 AA281183 W47526 W05015 R34165 R35396 T97366 R79640 W25258 R99450 AW368425 BE178196 R26447 
C03146 C03683

U25750 AI792472 AA487379 AI872282 AA487262 R22383 AI865750 R21832 AA593628 AW571869 AA377191 R78814 T27193
BE220675 AA345621 AA009992 
TABLE 2C

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of
human chromosome 22." Dunham I. et al. (1999) Nature 402:489-495.
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position
400481 8439853 Plus 112433-112541
400501 9796227 Minus 12479-12619
400713 8118874 Minus 43185-43394
400769 8131628 Plus 28671-29795
400818 8569994 Plus 172644-172765,173085-173200
400881 2842777 Minus 91446-91603,92123-92265
400882 2842777 Minus 110431-110708
400965 7770576 Minus 173043-173564
400986 8085497 Minus 63140-63319
400995 8099094 Plus 141186-141601
401093 8516137 Minus 22335-23166
401178 9438616 Minus 133663-133812
401192 9719502 Minus 69559-70101
401209 7712287 Plus 164932-165112
401405 7768126 Minus 69276-69452,69548-69958
401416 7452889 Minus 121456-121626
401419 7452889 Minus 136389-136508
401444 8346725 Plus 90895-90994,93070-93213
401512 7622346 Plus 136399-136557
401563 8247910 Plus 91395-91763
401600 4388746 Minus 27363-27518,28727-28891,29526-29731
401750 9828651 Plus 82143-82270,89284-89373,90596-90770,95822-96001,96688-96775,96870-96992,98046-98138
401757 7239630 Plus 88641-88751
401839 7656637 Plus 1016-1086,2751-2967,3241-3348,26677-26831
401849 7770425 Plus 129375-129483,129597-129720
401952 3319121 Minus 53770-53979
401966 3126781 Plus 29397-29918
402082 8117478 Minus 190046-190183
402101 8117697 Plus 134308-134487,135402-135587,136421-136548
402106 8131652 Plus 3717-3848
402163 8568936 Plus 166996-167119
402185 8576002 Plus 25486-25639
402240 7690131 Plus 104382-104527,106136-106372
402249 7704953 Minus 107636-107813,108694-108824,110435-110502,113182-113386
402347 8099267 Minus 13714-15440
402396 1905896 Plus 4426-4648
402469 9797107 Minus 71266-72351
402532 9800951 Minus 180240-180558
402559 9864273 Plus 33539-33715
402575 9884830 Minus 109742-109883
402602 7239666 Plus 6785-6972,7478-7575
402758 9213869 Plus 87638-87924
402786 9715046 Plus 47624-47795
402807 6456148 Minus 101542-101660,103476-103656
402810 6010110 Plus 12715-12856,13527-13643
402964 9581599 Minus 46624-46784
403046 3540153 Minus 55707-55859,56369-56511
403055 8748904 Minus 109532-110225
403217 7630969 Plus 54089-54163,55427-55623
403218 7630969 Plus 58039-58149
403291 7230870 Plus 95177-95435
403328 8469086 Minus 120428-120703
403654 8736093 Minus 28634-28758
403704 4982546 Minus 8850-8996
403708 5705981 Minus 134394-134812
hUJ/ £3 / 9J*twO 1 Plus 86737-86843
403739 7630882 Plus 44563-44766,48209-48483,52255-52495
403740 7630882 Plus 86504-87227
403745 7652036 Minus 676KW8002
403746 7652036 Plus 93612-93887
403885 7710403 Minus 53259-53524
403893 7710581 Minus 5435-7846
403947 7711923 Plus 38657-38817
404039 8698763 Plus 81889-82011
404054 3548785 Plus 66713-69175
404058 3548785 Plus 99397-101808
404108 8247074 Minus 63603-64942
404211 5006246 Plus 185728-185685,194575-194686
404277 1834458 Minus 91665-91946
404384 8887028 Minus 38055-38156,4217S42391,43435-43553
404407 7329316 Minus 48154-48499
404489 8113772 Plus 98183-98480
404527 8152087 Plus 127737-127796,128080-128210,129888-130054,132545-132889
404528 8152087 Plus 135325-135486
404661 9797073 Plus 33374-33675,33769-34008
404663 9797133 Plus 29885-30514
404956 7387343 Plus 55883-56203
405011 6139150 Plus 117359-117612
405044 7596797 Minus 98903-101141
405102 8076881 Minus 120922-121296
405109 8096886 Minus 30301-30518
405188 6649489 Plus 134573-134678
405231 7249032 Minus 109793-109969
405365 2275192 Minus 119867-12037^120481-120824,121029-121357
405387 6587915 Minus 3769-3833,570^5895
405396 6624129 Minus 89965-90273
405429 7321905 Minus 51577-51723
405435 7408068 Minus 51704-51841,53581-53767
405446 7582529 Plus 99136-99313
405503 9211311 Minus 51198-51314
405525 9558552 Minus 19699-19828
405529 9581957 Minus 38944-39213
405610 5757553 Minus 71907-72080
405802 5924004 Minus 27743-28264
405811 4902753 Plus 5128-5248
406180 7283201 Minus 38923-39107
406207 5923650 Minus 162607-162800
406302 8575868 Plus 168961-169150,169610-169769
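As an illustration of how the nucleotide-position column of Table 2C can be read, the following is a minimal Python sketch that splits one entry into its predicted exons and computes each exon's length. The function names are hypothetical, and the sketch assumes the coordinates are 1-based and inclusive at both ends, which the legend does not state explicitly.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Split a position field such as '6785-6972,7478-7575' into (start, end) pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        exons.append((int(start_text), int(end_text)))
    return exons


def exon_lengths(exons: List[Tuple[int, int]]) -> List[int]:
    """Length of each exon, assuming inclusive coordinates (an assumption, see above)."""
    return [end - start + 1 for start, end in exons]


# Entry listed for Pkey 401600 (Minus strand) in Table 2C.
exons = parse_nt_position("27363-27518,28727-28891,29526-29731")
lengths = exon_lengths(exons)
print(exons)
print(lengths, sum(lengths))
```

Under the inclusive-coordinate assumption, the three predicted exons of this entry are 156, 165 and 206 nucleotides long.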
Table 3A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences in Table 4.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 

Seq ID No: Seq ID number correlation for those sequences in Table 4 



Pkey    ExAccn    UnigeneID  Unigene Title                             Seq ID No
415539  AI733881  Hs.72472   BMP-R1B                                   Seq ID No 1 & 2
448988  Y09763    Hs.22785   gamma-aminobutyric acid (GABA) A recepto  Seq ID No 3-10
403740                       NM_001076*:Homo sapiens UDP glycosyltran  Seq ID No 11 & 12
408633  AW963372  Hs.46677   PRO2000 protein                           Seq ID No 13 & 14
408660  AA525775             ESTs, Moderately similar to PC4259 ferri  Seq ID No 15 & 16
409051  AA080912             gb:zn04d03.r1 Stratagene hNT neuron (937  Seq ID No 17
409123  AA063403             gb:zm04d12.s1 Stratagene corneal stroma   Seq ID No 18
415787  H01463    Hs.93534   ESTs                                      Seq ID No 19-21
415999  AA172179  Hs.294029  ESTs                                      Seq ID No 22
416225  AA577730  Hs.188684  ESTs, Weakly similar to PC4259 ferritin   Seq ID No 23
420757  X78592    Hs.99915   androgen receptor (dihydrotestosterone r  Seq ID No 24 & 25
429163  AA884766             gb:am20a10.s1 Soares_NFL_T_GBC_S1 Homo s  Seq ID No 26
429441  AJ224172  Hs.204096  lipophilin B (uteroglobin family member)  Seq ID No 27 & 28
431099  Y13367    Hs.249235  phosphoinositide-3-kinase, class 2, alph  Seq ID No 29 & 30
432432  AA541323  Hs.115831  ESTs                                      Seq ID No 31
432435  BE218886  Hs.282070  ESTs                                      Seq ID No 32 & 33
432527  AW975028  Hs.102754  ESTs                                      Seq ID No 34
435876  AW612586  Hs.160271  G protein-coupled receptor 48             Seq ID No 35 & 36
438233  W52448    Hs.56147   ESTs                                      Seq ID No 37-40
439569  AW602166  Hs.222399  CEGP1 protein                             Seq ID No 41 & 42
440819  AI809444  Hs.202108  ESTs                                      Seq ID No 43
442832  AW206560  Hs.253569  ESTs                                      Seq ID No 44
447342  AI199268  Hs.19322   Homo sapiens, Similar to RIKEN cDNA 2010  Seq ID No 45 & 46
447499  AW262580  Hs.147674  protocadherin beta 16                     Seq ID No 47 & 48
451411  AA017492  Hs.135655  EST                                       Seq ID No 49
451720  AW970985  Hs.290853  ESTs                                      Seq ID No 50 & 51
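The Seq ID No column of Table 3A mixes single numbers, pairs and ranges. As a hedged illustration of how such an entry could be expanded into the individual sequence numbers of Table 4, here is a small Python sketch; the function name is hypothetical, and the sketch assumes that a hyphen denotes an inclusive range and that an ampersand joins individual numbers.

```python
import re
from typing import List


def parse_seq_id_field(field: str) -> List[int]:
    """Expand a field such as 'Seq ID No 3-10' or 'Seq ID No 1 & 2' into integers."""
    # Drop the 'Seq ID No' label, tolerating missing spaces from text extraction.
    text = re.sub(r"(?i)seq\s*id\s*no", "", field).strip()
    range_match = re.fullmatch(r"(\d+)\s*-\s*(\d+)", text)
    if range_match:
        first, last = map(int, range_match.groups())
        return list(range(first, last + 1))  # inclusive range (assumption)
    return [int(number) for number in re.findall(r"\d+", text)]


print(parse_seq_id_field("Seq ID No 3-10"))
print(parse_seq_id_field("Seq ID No 1 & 2"))
print(parse_seq_id_field("Seq ID No 17"))
```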
Table 3B shows the accession numbers for those Pkeys lacking UnigeneIDs in Table 3A. For each probeset, the gene cluster number from which the oligonucleotides were
designed is listed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using
Clustering and Alignment Tools (DoubleTwist, Oakland, California). Genbank accession numbers for the sequences comprising each cluster are listed in the "Accession" column.

Pkey CAT Number Accession
408660 107294.1 AA525775 AA056342 AI538978 AW975281 AA664986
409051 109699.1 AA080912 AA075318 AA083403 AA076594 AA078992 AA084926 AA081881 AA113913 AA113892 AA083821 AA134801 AA082953 AA070343
    AA062835 AA075419 AA063293 AA071252 AA078900 AA062836 AW974305
409123 110143.1 AA063403 AA070823 AA070050
429163 300543.1 AA884766 AW974271 AA592975 AA447312
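To make the row structure of Table 3B concrete (one probeset, one gene cluster number, then the accessions that make up the cluster), the following is a minimal Python sketch. The `ClusterRow` class and the function names are hypothetical, and the sketch assumes one whitespace-separated row per probeset with the cluster number written in its dotted form (for example 300543.1).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterRow:
    pkey: int              # Eos probeset identifier
    cat_number: str        # gene cluster number, kept as text to preserve the suffix
    accessions: List[str]  # Genbank accessions comprising the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Parse 'Pkey CAT_number accession accession ...' into a ClusterRow."""
    fields = line.split()
    return ClusterRow(pkey=int(fields[0]), cat_number=fields[1], accessions=fields[2:])


def index_by_accession(rows: List[ClusterRow]) -> Dict[str, int]:
    """Map every accession back to the probeset whose cluster contains it."""
    return {acc: row.pkey for row in rows for acc in row.accessions}


rows = [
    parse_cluster_row("409123 110143.1 AA063403 AA070823 AA070050"),
    parse_cluster_row("429163 300543.1 AA884766 AW974271 AA592975 AA447312"),
]
lookup = index_by_accession(rows)
print(rows[1])
print(lookup["AA070823"])
```

Keeping the cluster number as a string rather than a float avoids losing the version-like suffix.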
Table 3C shows genomic positioning for those Pkeys lacking UnigeneIDs and accession numbers in Table 3A. For each predicted exon, the genomic sequence source
used for prediction is listed. Nucleotide locations of each predicted exon are also listed.

Pkey Ref Strand Nt_position
403740 7630882 Plus 86504-87227






Table 4: 

Seq ID NO: 1 DNA sequence 

Nucleic Acid Accession #: NM_001203 

Coding sequence: 274..1782






1          11         21         31         41         51
|          |          |          |          |          |
CGCGGGGCGC GGAGTCGGCG GGGCCTCGCG GGACGCGGGC AGTGCGGAGA CCGCGGCGCT
GAGGACGCGG GAGCCGGGAG CGCACGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT
GTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA GGTTCAGACT TCTGCTGATT
CATAACCATT TGGCTCTGAG CTATGACAAG AGAGGAAACA AAAAGTTAAA CTTACAAGCC
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAATTA
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC CCACCCCCCG TCCAAAGGTC
TTGCGTTGTA AATGCCACCA CCATTGTCCA GAAGACTCAG TCAACAATAT TTGCAGCACA
GACGGATATT GTTTCACGAT GATAGAAGAG GATGACTCTG GGTTGCCTGT GGTCACTTCT
GGTTGCCTAG GACTAGAAGG CTCAGATTTT CAGTGTCGGG ACACTCCCAT TCCTCATCAA
AGAAGATCAA TTGAATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT ACACCCTACA
CTGCCTCCAT TGAAAAACAG AGATTTTGTT GATGGACCTA TACACCACAG GGCTTTACTT
ATATCTGTGA CTGTCTGTAG TTTGCTCTTG GTCCTTATCA TATTATTTTG TTACTTCCGG
TATAAAAGAC AAGAAACCAG ACCTCGATAC AGCATTGGGT TAGAACAGGA TGAAACTTAC
ATTCCTCCTG GAGAATCCCT GAGAGACTTA ATTGAGCAGT CTCAGAGCTC AGGAAGTGGA
TCAGGCCTCC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG
ATTGGAAAAG GTCGCTATGG GGAAGTTTGG ATGGGAAAGT GGCGTGGCGA AAAGGTAGCT
GTGAAAGTGT TCTTCACCAC AGAGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA
GGGTCCTGGA CCCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTTTATGAT
TATCTGAAGT CCACCACCCT AGACGCTAAA TCAATGCTGA AGTTAGCCTA CTCTTCTGTC
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC
ACTCGAGTTG GCACCAAACG CTATATGCCT CCAGAAGTGT TGGACGAGAG CTTGAACAGA
AATCACTTCC AGTCTTACAT CATGGCTGAC ATGTATAGTT TTGGCCTCAT CCTTTGGGAG
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC
ATGACAGAAT GCTGGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GGTTAAGAAA
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCT GATAGGAGAG GAAAAGTAAG
CATCTCTGCA GAAAGCCAAC AGGTACTCTT CTGTTTGTGG GCAGAGCAAA AGACATCAAA
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCCTGCTTC CCAGTGGGTT CAGACCTCAC
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATCCGTG
TCTGTTTGTA GGCGGAGAAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT
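The "Coding sequence: 274..1782" annotation for Seq ID NO: 1 can be checked against the listing with a short calculation. The Python sketch below is illustrative only: it assumes the 10-nucleotide blocks are read across the six columns (so that the fifth row covers positions 241 to 300), that the coordinates are 1-based and inclusive, and that the standard genetic code applies. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def cds_length(start: int, end: int) -> int:
    """Length of a coding sequence given inclusive 1-based coordinates."""
    return end - start + 1


def translate(dna: str) -> str:
    """Translate DNA in frame from its first base, ignoring a trailing partial codon."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Fifth row of the Seq ID NO: 1 listing, assumed to span positions 241..300.
ROW_START = 241
row = "TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAATTA".replace(" ", "")

cds_start, cds_end = 274, 1782
offset = cds_start - ROW_START          # 0-based index of position 274 within the row
print(cds_length(cds_start, cds_end))   # 1509 nucleotides, i.e. 503 codons including the stop
print(translate(row[offset:]))          # MLLRSAGKL
```

The translated fragment agrees with the first nine residues listed for Seq ID NO: 2, which supports reading the blocks across the columns rather than down them.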



Seq ID NO: 2 Protein sequence 
Protein Accession #: NP_001194



1 
I 

MLLRSAGKLN 
DSGLPWTSG 
GPIHHRAIjLI 
EQSQSSGSGS 
WFRETEIYQT 
MLKLAYSSVS 
TNEVDIPPNT 
EEYQL.PYHDL 
RLTALRVKKT 



11 
I 

VGTKKEDGES 
CLGLEGSDFQ 
SVTVCSDLIiV 
GLPLLVQRTI 
VltMRHENILG 
GLCHLHTE I F 
RVGTKRYMPP 
VPSDPSYEDM 
LAKMSESQDI 



21 
I 

TAPTPRPKVL 
CRDTPIPHQR 
IillLFCYFRY 
AKQIQMVKQI 
PIAADIKGTG 
STQGKPAIAH 
EVLDESLNRN 
REIVCIKKLR 
KL 



31 
I 

RCKCHHHCPE 
RS I ECCTERN 
KRQETRPRYS 
GKGRYGEVWM 
SWTQLYLITD 
RDLKSKNILV 
HFQSYIMADM 
PSFPNRWSSD 



Seq ID NO: 3 DNA sequence 

Nucleic Acid Accession #: NM_004961.2 

Coding sequence: 55..1575



1 
I 

GCCAGAGCGT 
TCCAAAGTTC 
CCTCAGACTG 
CAGCCTCTGG 
AGCAGAGTTG 
GACCACAAAC 
GTCAACAGCC 
TCCCAGACCT 
AATGGCAATG 
ACCCACGAGC 
GTGTTGTACA 
CCAATGGATT 
ATGATCTACA 
TTCCAGTTTG 
GACTTCATGG 
CAAAACTATG 
ACAGAGTCTG 
TTGGGCACCT 
TATATCGCCA 



11 
I 

GAGCCGCGAC 
TTCCAGTCCT 
AATCAAAGAA 
AAAATCAGCT 
GCAAACTGCC 
TGCGCCCTGG 
TTGGTCCTCT 
GGTACGACGA 
TGGTGAGCCA 
ATGAGATCAC 
CAATTAGGAT 
CTCACTCTTG 
AGTGGGAAAA 
ATTTTACAGG 
TCATGACGAT 
TCCCTTCTTC 
CTCCAGCCCG 
TTTCTCGTAA 
TCTGCTTCGT 



21 
I 

CTCCGCGCAG 
CCTAGGCATC 
TGAAGCCTCT 
CCTCTCTGAG 
AGAAGCCTCT 
CATTGGAGAG 
CTCTATCCTA 
ACGCCTCTGT 
GCTATGGATC 
CATGCCCAAC 
GACCATTGAT 
CCCTCTATCT 
TTTCAAGCTT 
AGTGAGCAAC 
TTTCTTCAAT 
CGTGACCACG 
GACCTCTCTA 
GAATTTCCCG 



31 
I 

GTGGTCGCGC 
TTATTGATCC 
TCCCGTGATG 
GAAACAAAGT 
CGCATCCTGA 
AAGCCCACTG 
GACATGGAAT 
TACAACGACA 
CCGGACACCT 
CAGATGGTCC 
GCCGGATGCT 
TTCTCTAGCT 
GAAATCAATG 
AAAACTGAAA 
GTGAGCAGGC 
ATGCTCTCCT 
GGGATCACCT 
CGTGTCTCCT 
TGCGCTCTGT 



41 
I 

DSVNNICSTD 
ECNKDLHPTL 
IGLEQDETYI 
GKWRGEKVAV 
YHENGSLYDY 
KKNGTCCIAD 
YSFGLILWEV 
ECLRQMGKLM 



41 
I 

CGGTCTCCGC 
TCCAGTCGAG 
TTGTCTATGG 
CAACTGAGAC 
ACACTATCCT 
TGGTCACTGT 
ACACCATTGA 
CCTTTGAGTC 
TTTTTAGGAA 
GCATCTACAA 
CACTCCACAT 
TTTCCTATCC 
AGAAGAACTC 
TAATCACAAC 
GGTTTGGCTA 
GGGTTTCCTT 
CTGTTCTGAC 
ATATCACAGC 
TGGAGTTTGC 



51 
I 

GYCFTMIEED 
PPLKNRDFVD 
PPGESLRDLI 
KVFFTTEEAS 
liKSTTLDAKS 
LGLAVKFISD 
ARRCVSGGIV 
TECWAHNPAS 



51 
I 

GGAAATGTTG 
GGTCGAGGGA 
CCCCCAGCCC 
TGAGACTGGG 
GAGTAATTAT 
TGAGATCGCC 
CATCATCTTC 
TCTTGTTCTG 
TTCTAAGAGG 
GGATGGCAAG 
GCTCAGATTT 
TGAGAATGAG 
CTGGAAGCTC 
CCCAGTTGGT 
TGTTGCCTTT 
TTGGATCAAG 
CATGACCACG 
CTTGGATTTC 
TGTGCTCAAC 



60 
120 
180 
240 
30O 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 



60 
120 
180 
240 
300 
360 
420 
480 



60 
120 
180 
240 
300 
3 60 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
TTCCTGATCT
AATAGCCGTG 
GAAGCTTTTG 
TGCTCAGCCC 
AAGCTGGCCT 
TGTGAGGGCA 
TACT CGAGAG 
TGCCTTAACT 
GGTCCAAGCC 
AGTTTTTCCT 
CTCCCCTACC 
TGGGCCACCT 
ACTTAGTGAT 
AATGCTGACC 
CTTTCGGCCC 
GGCACCTCAT 
TCAGATTATT 
CACTGGCATT 
TTCCTCTCTC 
ACTACCAATT 
TACTCCCTGC 
ACTTTCCCAG 
GCCAAGAAAC 
ACCCAGGGCA 
CACTGTTATA 
GTATGGCACT 
GAT AG CCTTG 
GAGTCACAGA 
ACCTTCTAGA 
CTCTGCTGGC 
GGCCTGAGGT 
GAATGAATTT 
TGTTGGGGGG 
GGAAATATGT 



ACAACCAGAC 
CCCATGCCCG 
TGTGCCAGAT 
AGCAGCCCCC 
GCTGTGAGTG 
GTACCTGGCA 
TTGTTTT CCC 
TGTAGGTACC 
CCTTGCCAAG 
GCCCCATTCC 
TGGCCCATTC 
CCCTCTTCTT 
CAGCTCCCTA 
ACCAGACAAT 
AGTTCTGGCC 
TAAGATGCTG 
ATGTTCTCAG 
ATCCCTTTAG 
TCTGCTGCTG 
CAATGCCCTT 
TTTATATGCC 
TGACTTCCCC 
TAAGGAAACT 
CACTGTCGGA 
CCCGGGGCAC 
GGAACTTTGG 
TGACATCTTT 
TTTCTGTGGG 
CCACATGATA 
ACACCAGTGG 
GCTCAGACTG 
GGACATGCCC 
TGGATAGGGT 
AAATAAATAT 



AAAAGCCCAT 
TACCCGTGCA 
TGTCACCACT 
TAGCCCAGGT 
GTGCAAGCGT 
GCAGGGCCGC 
AGTGACTTTC 
AGCTGGTACC 
GGAGTTGGGG 
CCAAACAGAA 
ACTGAGTCTT 
CAAGGAGCAT 
AAACCATGCC 
TACTGCATTT 
TCAGCCTCAA 
GGCAGCAGTA 
TTCTCTCTCC 
GAAGAGGGGG 
TGACATCTCC 
CATCCAATGG 
ACCCTCTTCC 
TAG CCCTGAC 
CGGCTTTGCA 
GTTCTATCAC 
TCTAACCATC 
CAAAGCACTT 
AGGGCAGGAT 
ACTGTGGATC 
GGGCTAGACA 
CAAGGCCCAG 
CCCCCAAGAT 
CAATGCTTCT 
GGGGTCTCCA 
ATCAGCAAAG 



GCTTCTCCTA 
CGTTCCCGAG 
GAGGGAAGTG 
AGCCCTGAGG 
TTTAAGAAGT 
CTCTGCATCC 
TTCTTCTTCA 
CTGTGGGGCA 
GAAAGCAGCA 
GCTTGCAGAG 
CTCAGCAGAC 
CCGTGATGCT 
TAAGTACAGG 
TTCCAGAAGC 
AGTGCACCGA 
TAACAGGAGG 
CTGCTACCCC 
GGGCAGCAAG 
CTCTCCTTGC 
GT AT CTATTT 
TTCTCTTTGA 
CCAGGCACTA 
ACAGGCATTA 
TTGCTTGACC 
ACAAT CAATC 
TTGACAAGTT 
TCTTATCCCC 
TCACTGGAAG 
GCTCAGTTCA 
AATGGCGACC 
CAAATCTCTC 
ATATGCTAAG 
TCTACTTTTT 
CAAAAAGAAA 



Seq ID NO: 4 Protein sequence 
Protein Accession #: NP_004952.1



MLSKVLPVLL 
TGSRVGKLPE 
IFSQTWYDER 
GKVLYTIRMT 
KLFQFDFTGV 
IKTESAPART 
LNFLIYNQTK 
PSCSAQQPPS 
DNYSRWPPV 



11 
I 

GILblLQSRV 
ASRILNTILS 
LCYNDTPESL 
IDAGCSLHML 
SNKTEIITTP 
SLGITSVLTM 
AHASPKLRHP 
PGSPEGPRSL 
TFFFFNVLYW 



21 
I 

EGPQTESKNE 
NYDHKLRPGI 
VLNGNWSQIi 
RFPMDSHSCP 
VGDFMVMTIF 
TTLGTFSRKN 
RINSRAHART 
CSKLACCEWC 
LVCLNL 



31 
I 

ASSRDWYGP 
GEKPTWTVE 
W I PDTFFRNS 
LSFSSFSYPE 
FNVSRRFGYV 
FPRVSYITAL 
RARSRACARQ 
KRFKKYFCMV 



AACTCCGCCA 
CCTGTGCCCG 
ATGGAGAGGA 
GTCCCCGCAG 
ACTTCTGCAT 
ATGTCTACCG 
ATGTGCTCTA 
ACCTCTCCAG 
GCAGCAGCAG 
GGTTTGTCTT 
CATTTCAAAT 
CAGTGTTCAA 
CGGATTAGCT 
CCACTATTGC 
CTAGTTGCTT 
AAGAGATCCC 
TTTCTCTGCA 
AGAGCCTATT 
TGGCTCCATC 
TTGTGTGTGA 
CCCCTGTGAC 
GGCCTTGGTG 
CTCGCCATTG 
CCTGGACCCA 
AATCAAATTC 
GTGTCTGATT 
ATTT TGCAG A 
CTATCCAAGA 
CCATGATTCT 
TCTCTTTAGC 
CTGGCTGTAG 
TGAAATCTGT 
GTCACCATCA 
AAAAAAAA 



41 
I 

QPQPLENQLL 
IAVNSLGPLS 
KRTHEHEITM 
NEMIYKWENF 
AFQNYVPSSV 
DFYIAICFVF 
HQEAFVCQIV 
PDCEGSTWQQ 



TCCTCGTATC 
CCAACATCAG 
GCGCCCGTCT 
CCTCTGCTCC 
GGTCCCCGAT 
CCTGGATAAC 
CTGGCTTGTT 
TTCCCCAGGA 
GAGCGACTAG 
TGCTGCCCCT 
TATTAATAAA 
AACCACAGCC 
ATCTTCCAAC 
CTTTGTAGTG 
GCCTATACCT 
TCTCCTTTGG 
GATAGATAGA 
TGGGACAGCA 
TTTCGTCTGC 
TTATAGTAAC 
TCTTTCTGTA 
ACTTCCTGGG 
ATTGGTGCCC 
TAAACCAGTC 
CCTTAAATTT 
GGAGCTTCAT 
TGAAAACCCT 
GCCCACTGTC 
CTTCTGTCAC 
TCAATTTCTG 
TAACCCAGTG 
GTCTGTAATT 
TCTGAAATGG 



51 
I 

SEETKSTETE 
ILDMEYTIDI 
PMQMVRIYKD 
KLEINEKNSW 
TTMLSWVSFW 
CFCAT.TiKFAV 
TTEGSDGEER 
GRLCIHVYRL 



Seq ID NO: 5 DNA sequence 

Nucleic Acid Accession #: NM_021984. 

Coding sequence: 572..1753



GCCAGAGCGT 
TCCAAAGTTC 
CAGAGAAGTG 
GTGTAAAGAA 
CACTGCCTCC 
TCAGACTGAA 
GCCTCTGGAA 
CAGAGTTGGC 
CCACAAACTG 
CAACAGCCTT 
CCAGACCTGG 
TGGCAATGTG 
CCACGAGCAT 
GTTGTACACA 
AATGGATTCT 
GATCTACAAQ 
CCAGTTGGAT 
CTTCATGGTC 
AAACTATGTC 
AGAGTCTGCT 
GGGCACCTTT 
TATCGCCATC 
CCTGATCTAC 
TAGCCGTGCC 
AGCTTTTGTG 
CTCAGCCCAG 
GCTGGCCTGC 



11 
I 

GAGCCGCGAC 
TTCCAGTCCT 
CTCAAATCAT 
AGCCAAATCA 
CAGCAAAGGC 
TCAAAGAATG 
AATCAGCTCC 
AAACTGCCAG 
CGCCCTGGCA 
GGTCCTCTCT 
TACGACGAAC 
GTGAGCCAGC 
GAGATCACCA 
ATTAGGATGA 
CACTCTTGCC 
TGGGAAAATT 
TTTACAGGAG 
ATGACGATTT 
CCTTCTTCCG 
CCAGCCCGGA 
TCTCGTAAGA 
TGCTTCGTCT 
AACCAGACAA 
CATGCCCGTA 
TGCCAGATTG 
CAGCCCCCTA 
TGTGAGTGGT 



21 

I 

CTCCGCGCAG 
CCTAGGCATC 
AAGTGTACAG 
AGGACCCGAA 
AGCACTATCC 
AAGCCTCTTC 
TCTCTGAGGA 
AAGCCTCTCG 
TTGGAGAGAA 
CTAT CCTAGA 
GCCTCTGTTA 
TATGGATCCC 
TGCCCAACCA 
CCATTGATGC 
CTCTATCTTT 
TCAAGCTTGA 
TGAGCAACAA 
TCTTCAATGT 
TGACCACGAT 
CCTCTCTAGG 
ATTTCCCGCG 
TCTGCTTCTG 
AAGCCCATGC 
CCCGTGCACG 
TCACCACTGA 
GCCCAGGTAG 
GCAAGCGTTT 



31 
I 

GTGGTCGCGC 
TTATTGATCC 
CTGATGAGTT 
TGTGAGCAGG 
GGACTTCTAA 
CCGTGATGTT 
AACAAAGTCA 
CATCCTGAAC 
GCCCACTGTG 
CATGGAATAC 
CAACGACACC 
GGACACCTTT 
GATGGTCCGC 
CGGATGCTCA 
CTCTAGCTTT 
AATCAATGAG 
AACTGAAATA 
GAGCAGGCGG 
GCTCTCCTGG 
GATCACCTCT 
TGTCTCCTAT 
CGCTCTGTTG 
TTCTCCTAAA 
TTCCCGAGCC 
GGGAAGTGAT 
CCCTGAGGGT 
TAAGAAGTAC 



41 

1 

CGGTCTCCGC 
TCCAGTCGAG 
GTCAAAAAAT 
ACCTCAGAAG 
CACCATCGGG 
GTCTATGGCC 
ACTGAGACTG 
ACTATCCTGA 
GTCACTGTTG 
ACCATTGACA 
TTTGAGTCTC 
TTTAGGAATT 
ATCTACAAGG 
CTCCACATGC 
TCCTATCCTG 
AAGAACTCCT 
ATCACAACCC 
TTTGGCTATG 

GTTCTGACCA 
ATCACAG CCT 
GAGTTTGCTG 
CTCCGCCATC 
TGTGCCCGCC 
GGAGAGGAGC 
CCCCGCAGCC 
TTCTGCATGG 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 



60 
120 
180 
240 
300 
360 
420 
480 



51 
I 

GGAAATGTTG 
AACATGTATA 
GACCACAGCG 
CCCCCTTTGT 
TCGAGGGACC 
CCCAGCCCCA 
AGACTGGGAG 
GTAATTATGA 
AGATCTCCGT 
TCATCTTCTC 
TTGTTCTGAA 
CTAAGAGGAC 
ATGGCAAGGT 
TCAGATTTCC 
AGAATGAGAT 
GGAAGCTCTT 
CAGTTGGTGA 
TTGCCTTTCA 
GG AT CAAGAC 
TGACCACGTT 
TGGATTTCTA 
TGCTCAACTT 
CTCGTATCAA 
AACATCAGGA 
GCCCGTCTTG 
TCTGCTCCAA 
TCCCCGATTG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
TGAGGGCAGT
CTCGAGAGTT 
CCTTAACTTG 
TCCAAGCCCC 
TTTTTOCTGC 
CCCCTACCTG 
GGCCACCTCC 
TTAGTGATCA 
TGCTGACCAC 
TTCGGCCCAG 
CACCTCATTA 
AGATTATTAT 
CTGGCATTAT 
CCTCTCTCTC 
TACCAATTCA 
CTCCCTGCTT 
TTTCCCAGTG 
CAAGAAACTA 
CCAGGGCACA 
CTGTTATACC 
ATGGCACTGG 
TAGCCTTGTG 
GT CACAGATT 
CTTCTAGACC 
CTGCTGGCAC 
CCTGAGGTGC 
ATGAATTTGG 
TTGGGGGGTG 
AAATATGTAA 



ACCTGGCAGC 
GTTTTCCCAG 
TAGGTACCAG 
TTGCCAAGGG 
CCCATTCCCC 
GCCCATTCAC 
CTCTTCTTCA 
GCTCCCTAAA 
CAGACAATTA 
TTCTGGCCTC 
AGATGCTGGG 
GTTCTCAGTT 
CCCTTTAGGA 
TGCTGCTGTG 
ATGCCCTTCA 
TATATGCCAC 
ACTTCCCCTA 
AGGAAACTCG 
CTGTCGGAGT 
CGGGGCACTC 
AACTTTGGCA 
ACATCTTTAG 
TCTGTGGGAC 
ACATGATAGG 
ACCAGTGGCA 
TCAGACTGCC 
ACATGCCCCA 
GATAGGGTGG 
ATAAATATAT 



AGGCCCGCCT 
TGACTTTCTT 
CTGGTACCCT 
AGTTGGGGGA 
AAACAGAAGC 
TGAGTTTTCT 
AGGAGCATCC 
ACCATGCCTA 
CTGCATTTTT 
AGCCTCAAAG 
CAGCAGTATA 
CTCTCTCCCT 
AGAGGGGGGG 
ACATCTCCCT 
TCCAATGGGT 

GCCCTGACCC 
GCTTTGCAAC 
TCTATCACTT 
TAACCATCAC 
AAGCACTTTT 
GGCAGGATTC 
TGTGGATCTC 
GCTAGACAGC 
AGGCCCAGAA 
CCCAAGATCA 
ATGCTTCTAT 
GGTCTCCATC 
CAGCAAAGC 



CTGCATCCAT 
CTTCTTCAAT 
GTGGGGCAAC 
AAGCAGCAGC 
TTGCAGAGGG 
CAGCAGACCA 
GTGATGCTCA 
AGTACAGGCG 
CCAGAAGCCC 
TGCACCGACT 
ACAGGAGGAA 
GCTACCCCTT 
GCAGCAAGAG 
CTCCTTGCTG 
ATCTATTTTT 
CTCTTTGACC 
AGGCACTAGG 
AGGCATTACT 
GCTTGACCCC 
AATCAATCAA 
GACAAGTTGT 
TTATCCCCAT 
ACTGGAAGCT 
TCAGTTCACC 
TGGCGACCTC 
AATCTCTCCT 
ATGCTAAGTG 
TACTTTTTGT 



GTCTACCGCC 
GTGCTCTACT 
CTCT CCAGTT 
AGCAGCAGGA 
TTTGTCTTTG 
TTTCAAATTA 
GTGTTCAAAA 
GATTAGCTAT 
ACTATTGCCT 
AGTTGCTTGC 
GAGATCCCTC 
TCTCTGCAGA 
AGCCTATTTG 
GCTCCATCTT 
GTGTGTGATT 
CCTGTGACTC 
CCTTGGTGAC 
CGCCATTGAT 
TGGACCCATA 
TCAAATTCCC 
GTCTGATTGG 
TTTGCAGATG 
ATCCAAGAGC 
ATGATTCTCT 
TCTTTAGCTC 
GGCTGTAGTA 
AAATCTGTGT 
CACCATCATC 



TGGATAACTA 
GGCTTGTTTG 
CCCCAGGAGG 
GCGACTAGAG 
CTGCCCCTCT 
TTAATAAATG 
CCACAGCCAC 
CTTCCAACAA 
TTGCAGTGCT 
CTATACCTGG 
TCCTTTGGTC 
TAGATAGACA 
GGACAGCATT 
TCGTCTGCAC 
ATAGTAACTA 
TTTCTGTAAC 
TTCCTGGGGC 
TGGTGCCCAC 
AACCAGTCCA 
TTAAATTTGT 
AGCTTCATGA 
AAAACCCTGA 
CCACTGTCAC 
TCTGTCACCT 
AATTTCTGGG 
ACCCAGTGGA 
CTGTAATTTG 
TGAAATGGGG 



Seq ID NO: 6 Protein sequence 
Protein Accession fc: NP 068819.1 



1 
I 

MEYTIDIIFS 
MVRIYKDGKV 
INEKNSWKLF 
LSWVSFWIKT 
ALLEFAVLNF 
GSDGEERPSC 
CIHVYRLDNY 



11 

I 

QTWYDERLCY 
LYTIRMT IDA 
QLDFTGVSNK 
E SAP ART S LG 
LIYNQTKAHA 
SAQQPPSPGS 
SRWPPVTFF 



21 
I 

NDTFESLVLN 
GCSLHMLRFP 
TEIITTPVGD 
ITSVLTMTTL 
SPKLRHPRIN 
PEGPRSLCSK 
FFNVLYWLVC 



31 

I 

GNWSQLWIP 
MDSHSCPLSF 
FMVMTIFFNV 
GTFSRKNFPR 
SRAHARTRAR 
LACCEWCKRF 
LNIi 



41 
I 

DTFFRNSKRT 
SSFSYPENEM 
SRRFGYVAFQ 
VSYITALDFY 
SRACARQHQE 
KKYFCMVPDC 



Seq ID NO: 7 DNA sequence 

Nucleic Acid Accession #: NM_021987.i 

Coding sequence: 572..16S7 



1 
I 

GCCAGAGCGT 
TCCAAAGTTC 
CAGAGAAGTG 
GTGTAAAGAA 
CACTGCCTCC 
TCAGACTGAA 
GCCTCTGGAA 
CAGAGTTGGC 
CCACAAACTG 
CAACAGCCTT 
CCAGACCTGG 
CCGCATCTAC 
CTCACTCCAC 
CTTTTCCTAT 
TGAGAAGAAC 
AATAATCACA 
GCGGTTTGGC 
CTGGGTTTCC 
CTCTGTTCTG 
CTATATCACA 
GTTGGAGTTT 
TAAACTCCGC 
AGCCTGTGCC 
TGATGGAGAG 
GGGTCCCCGC 
GTACTTCTGC 
CCATGTCTAC 
CAATGTGCTC 
CAACCTCTCC 
CAGCAG CAGC 
AGGGTTTGTC 
ACCATTTCAA 
CTCAGTGTTC 
GGCGGATTAG 



11 

I 

GAGCCGCGAC 
TTCCAGTCCT 
CTCAAATCAT 
AGCCAAATCA 
CAGCAAAGGC 
TCAAAGAATG 
AATCAGCTCC 
AAACTGCCAG 
CGCCCTGGCA 
GGTCCTCTCT 
AATTCTAAGA 
AAGGATGGCA 
ATGCTCAGAT 
CCTGAGAATG 
TCCTGGAAGC 
ACCCCAGTTG 
TATGTTGCCT 
TTTTGGATCA 
ACCATGACCA 
GCCTTGGATT 
GCTGTGCTCA 
CATCCTCGTA 
CGCCAACATC 
GAGCGCCCGT 
AGCCTCTGCT 
ATGGTCCCCG 
CGCCTGGATA 
TACTGGCTTG 
AGTTCCCCAG 
AGGAGCGACT 
TTTGCTGCCC 
ATTATTAATA 
AAAACCACAG 
CTATCTTCCA 



21 
I 

CTCCGCGCAG 
CCTAGGCATC 
AAGTGTACAG 
AGGACCCGAA 
AGCACTATCC 
AAGCCTCTTC 
TCTCTGAGGA 
AAGCCTCTCG 
TTGGAGAGAA 
CTATCCTAGA 
GGACCCACGA 
AGGTGTTGTA 
TTCCAATGGA 
AGATGATCTA 
TCTTCCAGTT 
GTGACTTCAT 
TTCAAAACTA 
AGACAGAGTC 
CGTTGGGCAC 
TCTATATCGC 
ACTTCCTGAT 
TCAATAGCCG 
AGGAAGCTTT 
CTTGCTCAGC 
CCAAGCTGGC 
ATTGTGAGGG 
ACTACTCGAG 
TTTGCCTTAA 
GAGGTCCAAG 
AGAGTTTTTC 
CTCTCCCCTA 
AATGGGCCAC 
CCACTTAGTG 
ACAATGCTGA 



31 
I 

GTGGTCGCGC 
TTATTGATCC 
CTGATGAGTT 
TGTGAGCAGG 
GGACTTCTAA 
CCGTGATGTT 
AACAAAGTCA 
CATCCTGAAC 
GCCCACTGTG 
CATGGAATAC 
GCATGAGATC 
CACAATTAGG 
TTCTCACTCT 
CAAGTGGGAA 
TGATTTTACA 
GGTCATGACG 
TGTCCCTTCT 
TGCTCCAGCC 
CTTTTCTCGT 
CATCTGCTTC 
CTACAACCAG 
TGCCCATGCC 
TGTGTGCCAG 
CCAGCAGCCC 
CTGCTGTGAG 
CAGTACCTGG 
AGTTGTTTTC 
CTTGTAGGTA 
CCCCTTGCCA 
CTGCCCCATT 
CCTGGCCCAT 
CTCCCTCTTC 
ATCAGCTCCC 
CCACCAGACA 



41 * 
I 

CGGTCTCCGC 
TCCAGTCGAG 
GTCAAAAAAT 
ACCTCAGAAG 
CACCATCGGG 
GTCTATGGCC 
ACTGAGACTG 
ACTATCCTGA 
GTCACTGTTG 
ACCATTGACA 
ACCATGCCCA 
ATGACCATTG 
TGCCCTCTAT 
AATTTCAAGC 
GGAGTGAGCA 
ATTTTCTTCA 
TCCGTGACCA 
CGGACCTCTC 
AAGAATTTCC 
GTCTTCTGCT 
ACAAAAGCCC 
CGTACCCGTG 
ATTGTCACCA 
CCTAGCCCAG 
TGGTGCAAGC 
CAGCAGGGCC 
CCAGTGACTT 
CCAGCTGGTA 
AGGGAGTTGG 
CCCCAAACAG 
TCACTGAGTT 
TTCAAGGAGC 
TAAAACCATG 
ATTACTGCAT 



51 
I 

HEHEITMPNQ 
IYKWENFKLE 
NYVPSSVTTM 
IAICFVFCFC 
AFVCQIVTTE 
EGSTWQQARL 



51 
I 

GGAAATGTTG 
AACATGTATA 
GACCACAGCG 
CCCCCTTTGT 
TCGAGGGACC 
CCCAGCCCCA 
AGACTGGGAG 
GTAATTATGA 
AGATCTCCGT 
TCATCTTCTC 
ACCAGATGGT 
ATGCCGGATG 
CTTTCTCTAG 
TTGAAATCAA 
ACAAAACTGA 
ATGTGAGCAG 
CGATGCTCTC 
TAGGGATCAC 
CGCGTGTCTC 
TCTGCGCTCT 
ATGCTTCTCC 
CACGTTCCCG 
CTGAGGGAAG 
GTAGCCCTGA 
GTTTTAAGAA 
GCCTCTGCAT 
TCTTCTTCTT 
CCCTGTGGGG 
GGGAAAGCAG 
AAGCTTGCAG 
TTCTCAGCAG 
ATCCGTGATG 
CCTAAGTACA 
TTTTCCAGAA 



1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2S20 
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2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 



60 
120 
180 
240 
300 
360 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
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1260 
1320 
1360 
1440 
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1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 






GCCCACTATT GCCTTTGCAG TGCTTTCGGC CCAGTTCTGG CCTCAGCCTC AAAGTGCACC 2100 

GACTAGTTGC TTGCCTATAC CTGGCACCTC ATTAAGATGC TGGGCAGCAG TATAACAGGA 2160 

GGAAGAGATC CCTCTCCTTT GGTCAGATTA TTATGTTCTC AGTTCTCTCT CCCTGCTACC 2220 

CCTTTCTCTG CAGATAGATA GACACTGGCA TTATCCCTTT AGGAAGAGGG GGGGGCAGCA 2280 

AGAGAGCCTA TTTGGGACAG CATTCCTCTC TCTCTGCTGC TGTGACATCT CCCTCTCCTT 2340

GCTGGCTCCA TCTTTCGTCT GCACTACCAA TTCAATGCCC TTCATCCAAT GGGTATCTAT 2400 

TTTTGTGTGT GATTATAGTA ACTACTCCCT GCTTTATATG CCACCCTCTT CCTTCTCTTT 2460 

GACCCCTGTG ACTCTTTCTG TAACTTTCCC AGTGACTTCC CCTAGCCCTG ACCAGGCACT 2520 

AGGCCTTGGT GACTTCCTGG GGCCAAGAAA CTAAGGAAAC TCGGCTTTGC AACAGGCATT 2580 

ACTCGCCATT GATTGGTGCC CACCCAGGGC ACACTGTCGG AGTTCTATCA CTTGCTTGAC 2640

CCCTGGACCC ATAAACCAGT CCACTGTTAT ACCCGGGGCA CTCTAACCAT CACAATCAAT 2700 

CAATCAAATT CCCTTAAATT TGTATGGCAC TGGAACTTTG GCAAAGCACT TTTGACAAGT 2760 

TGTGTCTGAT TGGAGCTTCA TGATAGCCTT GTGACATCTT TAGGGCAGGA TTCTTATCCC 2820 

CATTTTGCAG ATGAAAACCC TGAGTCACAG ATTTCTGTGG GACTGTGGAT CTCACTGGAA 2880 

GCTATCCAAG AGCCCACTGT CACCTTCTAG ACCACATGAT AGGGCTAGAC AGCTCAGTTC 2940

ACCATGATTC TCTTCTGTCA CCTCTGCTGG CACACCAGTG GCAAGGCCCA GAATGGCGAC 3000 

CTCTCTTTAG CTCAATTTCT GGGCCTGAGG TGCTCAGACT GCCCCCAAGA TCAAATCTCT 3060 

CCTGGCTGTA GTAACCCAGT GGAATGAATT TGGACATGCC CCAATGCTTC TATATGCTAA 3120 

GTGAAATCTG TGTCTGTAAT TTGTTGGGGG GTGGATAGGG TGGGGTCTCC ATCTACTTTT 3180 

TGTCACCATC ATCTGAAATG GGGAAATATG TAAATAAATA TATCAGCAAA GC

Seq ID NO: 8 Protein sequence 
Protein Accession #: NP_068822.1 

1 11 21 31 41 51

| | | | | |

MEYTIDIIFS QTWNSKRTHE HEITMPNQMV RIYKDGKVLY TIRMTIDAGC SLHMLRFPMD 60
SHSCPLSFSS FSYPENEMIY KWENFKLEIN EKNSWKLFQF DFTGVSNKTE IITTPVGDFM 120
VMTIFFNVSR RFGYVAFQNY VPSSVTTMLS WVSFWIKTES APARTSLGIT SVLTMTTLGT 180
FSRKNFPRVS YITALDFYIA ICFVFCFCAL LEFAVLNFLI YNQTKAHASP KLRHPRINSR 240

AHARTRARSR ACARQHQEAF VCQIVTTEGS DGEERPSCSA QQPPSPGSPE GPRSLCSKLA 300
CCEWCKRFKK YFCMVPDCEG STWQQGRLCI HVYRLDNYSR VVFPVTFFFF NVLYWLVCLN 360
L

Seq ID NO: 9 DNA sequence

Nucleic Acid Accession #: NM_021990.1
Coding sequence: 1309..2490

1 11 21 31 41 51 

| | | | | |

GCCAGAGCGT GAGCCGCGAC CTCCGCGCAG GTGGTCGCGC CGGTCTCCGC GGAAATGTTG 60 
TCCAAAGTTC TTCCAGTCCT CCTAGGCATC TTATTGATCC TCCAGTCGAG AACATGTATA 120 
CAGAGAAGTG CTCAAATCAT AAGTGTACAG CTGATGAGTT GTCAAAAAAT GACCACAGCG 180 
GTGTAAAGAA AGCCAAATCA AGGACCCGAA TGTGAGCAGG ACCTCAGAAG CCCCCTTTGT 240 

CACTGCCTCC CAGCAAAGGC AGCACTATCC GGACTTCTAA CACCATCGGT GAGTTTCATA 300

CCTTGGCAGA TGGCCTTTAA CATTTTTGTT TAATTCAATT ATTCTTACTA ATCTTCTTCT 360
TTTTCTTGGC TGTGGTGCAT GGCTGTGGAG CTCAGGGTGG ACTCCTGTTG GGCAGCCAGT 420 
TCCTGGATGG CTGTCTGTGG GTGGAGGACT CCTGCCTTTC CTGTTTAGAC ACCCACAAAG 480 
GCTGCTCTTT AGCCTCCTTC CCTTCATCCC CTTCCCCTGC CCCCAGTGCA ACGAGTATTA 540 

CACAACCAAC AAAACCGCAA AATATTCCCA CAATTTTCTG GTCCTCTCTG GGAGAGGCCG 600

CTCTGGCTTT TCCTCTCAGC CCTGGCCCTC TGCCTGCTCC TCACTCCTGG TTGGTGCTGG 660 
TCAGGCTGAC TAGAGGCCAA GGCGACCAAC ACTAGGCAAA CGCGGCCAGC GCTCAGACAT 720 
AAATGCCCTC TTCATTTCAC GTGTAACATT CTTTTAAAAT CTAGGTCTTG GTTTTGTTGA 780 
TTTTTTCTTA AATAAAAGAG TGATCATAAA AGAGGGACAG CATAGAAAGT CCCCAAAGAG 840

CAGCAAGGTT TTAAAGAAAT TCACAAGCCT AATCTGTCAC TGTCTTATAA TTTGCTATTA 900

CCAGTCACAA TTTAACTAGG TTTTGTGTTG AAAACTTGTT TTGGTTTGCT TCTGTCCCAA 960 

GAGGCACTAG CTGGGGCCCC TACAGAGTGC AGGGCAGAGC TTCATTTTTC GTTTGAATGT 1020 

TCTAGGGTCG AGGGACCTCA GACTGAATCA AAGAATGAAG CCTCTTCCCG TGATGTTGTC 1080 

TATGGCCCCC AGCCCCAGCC TCTGGAAAAT CAGCTCCTCT CTGAGGAAAC AAAGTCAACT 1140 

GAGACTGAGA CTGGGAGCAG AGTTGGCAAA CTGCCAGAAG CCTCTCGCAT CCTGAACACT 1200

ATCCTGAGTA ATTATGACCA CAAACTGCGC CCTGGCATTG GAGAGAAGCC CACTGTGGTC 1260

ACTGTTGAGA TCTCCGTCAA CAGCCTTGGT CCTCTCTCTA TCCTAGACAT GGAATACACC 1320 

ATTGACATCA TCTTCTCCCA GACCTGGTAC GACGAACGCC TCTGTTACAA CGACACCTTT 1380 

GAGTCTCTTG TTCTGAATGG CAATGTGGTG AGCCAGCTAT GGATCCCGGA CACCTTTTTT 1440 

AGGAATTCTA AGAGGACCCA CGAGCATGAG ATCACCATGC CCAACCAGAT GGTCCGCATC 1500

TACAAGGATG GCAAGGTGTT GTACACAATT AGGATGACCA TTGATGCCGG ATGCTCACTC 1560 

CACATGCTCA GATTTCCAAT GGATTCTCAC TCTTGCCCTC TATCTTTCTC TAGCTTTTCC 1620 

TATCCTGAGA ATGAGATGAT CTACAAGTGG GAAAATTTCA AGCTTGAAAT CAATGAGAAG 1680 

AACTCCTGGA AGCTCTTCCA GTTTGATTTT ACAGGAGTGA GCAACAAAAC TGAAATAATC 1740 

ACAACCCCAG TTGGTGACTT CATGGTCATG ACGATTTTCT TCAATGTGAG CAGGCGGTTT 1800

GGCTATGTTG CCTTTCAAAA CTATGTCCCT TCTTCCGTGA CCACGATGCT CTCCTGGGTT 1860 

TCCTTTTGGA TCAAGACAGA GTCTGCTCCA GCCCGGACCT CTCTAGGGAT CACCTCTGTT 1920 

CTGACCATGA CCACGTTGGG CACCTTTTCT CGTAAGAATT TCCCGCGTGT CTCCTATATC 1980 

ACAGCCTTGG ATTTCTATAT CGCCATCTGC TTCGTCTTCT GCTTCTGCGC TCTGTTGGAG 2040 

TTTGCTGTGC TCAACTTCCT GATCTACAAC CAGACAAAAG CCCATGCTTC TCCTAAACTC 2100

CGCCATCCTC GTATCAATAG CCGTGCCCAT GCCCGTACCC GTGCACGTTC CCGAGCCTGT 2160 

GCCCGCCAAC ATCAGGAAGC TTTTGTGTGC CAGATTGTCA CCACTGAGGG AAGTGATGGA 2220 

GAGGAGCGCC CGTCTTGCTC AGCCCAGCAG CCCCCTAGCC CAGGTAGCCC TGAGGGTCCC 2280 

CGCAGCCTCT GCTCCAAGCT GGCCTGCTGT GAGTGGTGCA AGCGTTTTAA GAAGTACTTC 2340 

TGCATGGTCC CCGATTGTGA GGGCAGTACC TGGCAGCAGG GCCGCCTCTG CATCCATGTC 2400

TACCGCCTGG ATAACTACTC GAGAGTTGTT TTCCCAGTGA CTTTCTTCTT CTTCAATGTG 2460 

CTCTACTGGC TTGTTTGCCT TAACTTGTAG GTACCAGCTG GTACCCTGTG GGGCAACCTC 2520 

TCCAGTTCCC CAGGAGGTCC AAGCCCCTTG CCAAGGGAGT TGGGGGAAAG CAGCAGCAGC 2580

AGCAGGAGCG ACTAGAGTTT TTCCTGCCCC ATTCCCCAAA CAGAAGCTTG CAGAGGGTTT 2640

GTCTTTGCTG CCCCTCTCCC CTACCTGGCC CATTCACTGA GTTTTCTCAG CAGACCATTT 2700 

CAAATTATTA ATAAATGGGC CACCTCCCTC TTCTTCAAGG AGCATCCGTG ATGCTCAGTG 2760 

TTCAAAACCA CAGCCACTTA GTGATCAGCT CCCTAAAACC ATGCCTAAGT ACAGGCGGAT 2820 

TAGCTATCTT CCAACAATGC TGACCACCAG ACAATTACTG CATTTTTCCA GAAGCCCACT 2880

ATTGCCTTTG CAGTGCTTTC GGCCCAGTTC TGGCCTCAGC CTCAAAGTGC ACCGACTAGT 2940 

TGCTTGCCTA TACCTGGCAC CTCATTAAGA TGCTGGGCAG CAGTATAACA GGAGGAAGAG 3000

ATCCCTCTCC TTTGGTCAGA TTATTATGTT CTCAGTTCTC TCTCCCTGCT ACCCCTTTCT 3060 

CTGCAGATAG ATAGACACTG GCATTATCCC TTTAGGAAGA GGGGGGGGCA GCAAGAGAGC 3120 

CTATTTGGGA CAGCATTCCT CTCTCTCTGC TGCTGTGACA TCTCCCTCTC CTTGCTGGCT 3180

CCATCTTTCG TCTGCACTAC CAATTCAATG CCCTTCATCC AATGGGTATC TATTTTTGTG 3240 

TGTGATTATA GTAACTACTC CCTGCTTTAT ATGCCACCCT CTTCCTTCTC TTTGACCCCT 3300 

GTGACTCTTT CTGTAACTTT CCCAGTGACT TCCCCTAGCC CTGACCAGGC ACTAGGCCTT 3360 

GGTGACTTCC TGGGGCCAAG AAACTAAGGA AACTCGGCTT TGCAACAGGC ATTACTCGCC 3420

ATTGATTGGT GCCCACCCAG GGCACACTGT CGGAGTTCTA TCACTTGCTT GACCCCTGGA 3480

CCCATAAACC AGTCCACTGT TATACCCGGG GCACTCTAAC CATCACAATC AATCAATCAA 3540 

ATTCCCTTAA ATTTGTATGG CACTGGAACT TTGGCAAAGC ACTTTTGACA AGTTGTGTCT 3600 

GATTGGAGCT TCATGATAGC CTTGTGACAT CTTTAGGGCA GGATTCTTAT CCCCATTTTG 3660 

CAGATGAAAA CCCTGAGTCA CAGATTTCTG TGGGACTGTG GATCTCACTG GAAGCTATCC 3720 

AAGAGCCCAC TGTCACCTTC TAGACCACAT GATAGGGCTA GACAGCTCAG TTCACCATGA 3780

TTCTCTTCTG TCACCTCTGC TGGCACACCA GTGGCAAGGC CCAGAATGGC GACCTCTCTT 3840 

TAGCTCAATT TCTGGGCCTG AGGTGCTCAG ACTGCCCCCA AGATCAAATC TCTCCTGGCT 3900 

GTAGTAACCC AGTGGAATGA ATTTGGACAT GCCCCAATGC TTCTATATGC TAAGTGAAAT 3960 

CTGTGTCTGT AATTTGTTGG GGGGTGGATA GGGTGGGGTC TCCATCTACT TTTTGTCACC 4020

ATCATCTGAA ATGGGGAAAT ATGTAAATAA ATATATCAGC AAAGC

Seq ID NO: 10 Protein sequence
Protein Accession #: NP_068830.1

1 11 21 31 41 51 

| | | | | |

MEYTIDIIFS QTWYDERLCY NDTFESLVLN GNVVSQLWIP DTFFRNSKRT HEHEITMPNQ 60

MVRIYKDGKV LYTIRMTIDA GCSLHMLRFP MDSHSCPLSF SSFSYPENEM IYKWENFKLE 120

INEKNSWKLF QFDFTGVSNK TEIITTPVGD FMVMTIFFNV SRRFGYVAFQ NYVPSSVTTM 180

LSWVSFWIKT ESAPARTSLG ITSVLTMTTL GTFSRKNFPR VSYITALDFY IAICFVFCFC 240

ALLEFAVLNF LIYNQTKAHA SPKLRHPRIN SRAHARTRAR SRACARQHQE AFVCQIVTTE 300

GSDGEERPSC SAQQPPSPGS PEGPRSLCSK LACCEWCKRF KKYFCMVPDC EGSTWQQGRL 360
CIHVYRLDNY SRVVFPVTFF FFNVLYWLVC LNL

Seq ID NO: 11 DNA sequence

Nucleic Acid Accession #: NM_001075.1
Coding sequence: 22..1614

1 11 21 31 41 51 

| | | | | |

TTCGGCACGA GTAAGACCAG GATGTCTCTG AAATGGACGT CAGTCTTTCT GCTGATACAG 60

CTCAGTTGTT ACTTTAGCTC TGGAAGCTGT GGAAAGGTGC TAGTGTGGCC CACAGAATAC 120 

AGCCATTGGA TAAATATGAA GACAATCCTG GAAGAGCTTG TTCAGAGGGG TCATGAGGTG 180 

ACTGTGTTGA CATCTTCGGC TTCTACTCTT GTCAATGCCA GTAAATCATC TGCTATTAAA 240 

TTAGAAGTTT ATCCTACATC TTTAACTAAA AATGATTTGG AAGATTCTCT TCTGAAAATT 300 

CTCGATAGAT GGATATATGG TGTTTCAAAA AATACATTTT GGTCATATTT TTCACAATTA 360

CAAGAATTGT GTTGGGAATA TTATGACTAC AGTAACAAGC TCTGTAAAGA TGCAGTTTTG 420 

AATAAGAAAC TTATGATGAA ACTACAAGAG TCAAAGTTTG ATGTCATTCT GGCAGATGCC 480 

CTTAATCCCT GTGGTGAGCT ACTGGCTGAA CTATTTAACA TACCCTTTCT GTACAGTCTT 540 

CGATTCTCTG TTGGCTACAC ATTTGAGAAG AATGGTGGAG GATTTCTGTT CCCTCCTTCC 600

TATGTACCTG TTGTTATGTC AGAATTAAGT GATCAAATGA TTTTCATGGA GAGGATAAAA 660

AATATGATAC ATATGCTTTA TTTTGACTTT TGGTTTCAAA TTTATGATCT GAAGAAGTGG 720 

GACCAGTTTT ATAGTGAAGT TCTAGGAAGA CCCACTACAT TATTTGAGAC AATGGGGAAA 780 

GCTGAAATGT GGCTCATTCG AACCTATTGG GATTTTGAAT TTCCTCGCCC ATTCTTACCA 840 

AATGTTGATT TTGTTGGAGG ACTTCACTGT AAACCAGCCA AACCCCTGCC TAAGGAAATG 900

GAAGAGTTTG TGCAGAGCTC TGGAGAAAAT GGTATTGTGG TGTTTTCTCT GGGGTCGATG 960 

ATCAGTAACA TGTCAGAAGA AAGTGCCAAC ATGATTGCAT CAGCCCTTGC CCAGATCCCA 1020 

CAAAAGGTTC TATGGAGATT TGATGGCAAG AAGCCAAATA CATTAGGTTC CAATACTCGA 1080 

CTGTACAAGT GGTTACCCCA GAATGACCTT CTTGGTCATC CCAAAACCAA AGCTTTTATA 1140 

ACTCATGGTG GAACCAATGG CATCTATGAG GCGATCTACC ATGGGATCCC TATGGTGGGC 1200 

ATTCCCTTGT TTGCGGATCA ACATGATAAC ATTGCTCACA TGAAAGCCAA GGGAGCAGCC 1260 

CTCAGTGTGG ACATCAGGAC CATGTCAAGT AGAGATTTGC TCAATGCATT GAAGTCAGTC 1320 

ATTAATGACC CTGTCTATAA AGAGAATGTC ATGAAATTAT CAAGAATTCA TCATGACCAA 1380 

CCAATGAAGC CCCTGGATCG AGCAGTCTTC TGGATTGAGT TTGTCATGCG CCACAAAGGA 1440 

GCCAAGCACC TTCGAGTCGC AGCTCACAAC CTCACCTGGA TCCAGTACCA CTCTTTGGAT 1500 

GTGATAGCAT TCCTGCTGGC CTGCGTGGCA ACTGTGATAT TTATCATCAC AAAATTTTGC 1560 

CTGTTTTGTT TCCGAAAGCT TGCCAAAACA GGAAAGAAGA AGAAAAGAGA TTAGTTATAT 1620 

CAAAAGCCTG AAGTGGAATG ACTGAAAGAT GGGACTCCTC CTTTATTTCA GCATGGAGGG 1680 

TTTTAAATGG AGGATTTCCT TTTTCCTGTG ACAAAACATC TTTTCACAAC TTACCTTGTT 1740 

AAGACAAAAT TTATTTTCCA GGGATTTAAT ACGTACTTTA GTTGGAATTA TTCTATGTCA 1800 

ATGATTTTTA AGCTATGAAA AATACAATGG GGGGAAGGAT AGCATTTGGA GATATACCTA 1860

ATGTTAAATG ACGAGTTACT GGATGCAGCA CGCAACATGG CACATGTGTA TACATATGTA 1920 

GCTAACCCTT CGTTGTGCAC ATGTACCCTA AAACTTAAAG TATAATTTAA AAAAAGCAAA 1980 

AAAAAAAAAT ACCAACTCTT TTTTTTAAAC CAGGAAGGAA AATGTGAACA TGGAAACAAC 2040 
TTCTAGTATT GGATCTGAAA ATAAAGTGTC ATCCAAGCCA TAAAAAAAAA 

Seq ID NO: 12 Protein sequence 
Protein Accession #: NP_001067.1

1 11 21 31 41 51

I I I I I I 

MSLKWTSVFL LIQLSCYFSS GSCGKVLVWP TEYSHWINMK TILEELVQRG HEVTVLTSSA 60

STLVNASKSS AIKLEVYPTS LTKNDLEDSL LKILDRWIYG VSKNTFWSYF SQLQELCWEY 120

YDYSNKLCKD AVLNKKLMMK LQESKFDVIL ADALNPCGEL LAELFNIPFL YSLRFSVGYT 180

FEKNGGGFLF PPSYVPVVMS ELSDQMIFME RIKNMIHMLY FDFWFQIYDL KKWDQFYSEV 240

LGRPTTLFET MGKAEMWLIR TYWDFEFPRP FLPNVDFVGG LHCKPAKPLP KEMEEFVQSS 300

GENGIVVFSL GSMISNMSEE SANMIASALA QIPQKVLWRF DGKKPNTLGS NTRLYKWLPQ 360

NDLLGHPKTK AFITHGGTNG IYEAIYHGIP MVGIPLFADQ HDNIAHMKAK GAALSVDIRT 420

MSSRDLLNAL KSVINDPVYK ENVMKLSRIH HDQPMKPLDR AVFWIEFVMR HKGAKHLRVA 480
AHNLTWIQYH SLDVIAFLLA CVATVIFIIT KFCLFCFRKL AKTGKKKKRD 

Seq ID NO: 13 DNA sequence 
Nucleic Acid Accession #: NM_014109.1
Coding sequence: 651..1739

1 11 21 31 41 51 

| | | | | |

CTGTCATTCA TGCTTTGGAA AAGTTTACTG TATATACATT AGACATTCCT GTTCTTTTTG 60

GAGTTAGTAC TACATCCCCT GAAGAAACAT GTGCCCAGGT GATTCGTGAA GCTAAGAGAA 120 

CAGCACCAAG TATAGTGTAT GTTCCTCATA TCCACGTGTG GTGGGAAATA GTTGGACCGA 180 

CACTTAAAGC CACATTTACC ACATTATTAC AGAATATTCC TTCATTTGCT CCAGTTTTAC 240 

TACTTGCAAC TTCTGACAAA CCCCATTCCG CTTTGCCAGA AGAGGTGCAA GAATTGTTTA 300

TCCGTGATTA TGGAGAGATT TTTAATGTCC AGTTACCGGA TAAAGAAGAA CGGACAAAAT 360

TTTTTGAAGA TTTAATTCTA AAACAAGCTG CTAAGCCTCC TATATCAAAA AAGAAAGCAG 420 

TTTTGCAGGC TTTGGAGGTA CTCCCAGTAG CACCACCACC TGAGCCAAGA TCACTGACAG 480 

CAGAAGAAGT GAAACGACTA GAAGAACAAG AAGAAGATAC ATTTAGAGAA CTGAGGATTT 540 

TCTTAAGAAA TGTTACACAT AGGCTTGCTA TTGACAAGCG ATTCCGAGTG TTTACTAAGC 600 

CTGTTGACCC TGATGAGGTT CCTGATTATG TCACTGTAAT AAAGCAACCA ATGGACCTTT 660

CATCTGTAAT CAGTAAAATT GATCTACACA AGTATCTGAC TGTGAAAGAC TATTTGAGAG 720 

ATATTGATCT AATCTGTAGT AATGCCTTAG AATACAATCC AGATAGAGAT CCTGGAGATC 780 

GTCTTATTAG GCATAGAGCC TGTGCTTTAA GAGATACTGC CTATGCCATA ATTAAAGAAG 840 

AACTTGATGA AGACTTTGAG CAGCTCTGTG AAGAAATTCA GGAATCTAGA AAGAAAAGAG 900 

GTTGTAGCTC CTCCAAATAT GCCCCGTCTT ACTACCATGT GATGCCAAAG CAAAATTCCA 960

CTCTTGTTGG TGATAAAAGA TCAGACCCAG AGCAGAATGA AAAGCTAAAG ACACCGAGTA 1020 

CTCCTGTGGC TTGCAGCACT CCTGCTCAGT TGAAGAGGAA AATTCGCAAA AAGTCAAACT 1080 

GGTACTTAGG CACCATAAAA AAGCGAAGGA AGATTTCACA GGCAAAGGAT GATAGCCAGA 1140 

ATGCCATAGA TCACAAAATT GAGAGTGATA CAGAGGAAAC TCAAGACACA AGTGTAGATC 1200 

ATAATGAGAC CGGAAACACA GGAGAGTCTT CGGTGGAAGA AAATGAAAAA CAGCAAAATG 1260

CCTCTGAAAG CAAACTGGAA TTGAGAAATA ATTCAAATAC TTGTAATATA GAGAATGAGC 1320

TTGAAGACTC TAGGAAGACT ACAGCATGTA CAGAATTGAG AGACAAGATT GCTTGTAATG 1380 

GAGATGCTTC TAGCTCTCAG ATAATACATA TTTCTGATGA AAATGAAGGA AAAGAAATGT 1440 

GTGTTCTGCG AATGACTCGA GCTAGACGTT CCCAGGTAGA ACAGCAGCAG CTCATCACTG 1500

TTGAAAAGGC TTTGGCAATT CTTTCTCAGC CTACACCCTC ACTTGTTGTG GATCATGAGC 1560

GATTAAAAAA TCTTTTGAAG ACTGTTGTTA AAAAAAGTCA AAACTACAAC ATATTTCAGT 1620 

TGGAAAATTT GTATGCAGTA ATCAGCCAAT GTATTTATCG GCATCGCAAG GACCATGATA 1680 

AAACATCACT TATTCAGAAA ATGGAGCAAG AGGTAGAAAA CTTCAGTTGT TCCAGATGAT 1740 

GATGTCATGG TATCGAGTAT TCTTTATATT CAGTTCCTAT TTAAGTCATT TTTGTCATGT 1800 

CCGCCTAATT GATGTAGTAT GAAACCCTGC ATCTTTAAGG AAAAGATTAA AATAGTAAAA 1860

TAAAAGTATT TAAACTTTCC TGATATTTAT GTACATATTA AGATAAATGT CATGTGTAAG 1920 
ATAACTGATA AATA 

Seq ID NO: 14 Protein sequence 
Protein Accession #: NP_054828.1

1 11 21 31 41 51 

I I I I I I 

MDLSSVISKI DLHKYLTVKD YLRDIDLICS NALEYNPDRD PGDRLIRHRA CALRDTAYAI 60 

IKEELDEDFE QLCEEIQESR KKRGCSSSKY APSYYHVMPK QNSTLVGDKR SDPEQNEKLK 120

TPSTPVACST PAQLKRKIRK KSNWYLGTIK KRRKISQAKD DSQNAIDHKI ESDTEETQDT 180

SVDHNETGNT GESSVEENEK QQNASESKLE LRNNSNTCNI ENELEDSRKT TACTELRDKI 240

ACNGDASSSQ IIHISDENEG KEMCVLRMTR ARRSQVEQQQ LITVEKALAI LSQPTPSLVV 300

DHERLKNLLK TVVKKSQNYN IFQLENLYAV ISQCIYRHRK DHDKTSLIQK MEQEVENFSC 360






Seq ID NO: 15 DNA sequence 
Nucleic Acid Accession #: AK001536



1 11 21 31 41 51

I I I I I I 

TATATGTGAC CTTTTTAAAA AATGAGCTGT AAGCAGTCTC CCAGACAGTA GCTCAGCCTC 60 

CAGAACTCTC TTTCTGCATA GTTGAAGACC CCTCTTCACA CAAGATGGTA GCAACAAATC 120 

ATAGGTGCAA TTGCACCAAA TTCACAGAAG ATCAATTGAA AATCCTCATC AATACCTTCA 180 

CTCAAAAACC TTACCCAGGT TATGCTACCA AACAAAAACT TGCTTTAGCA ATCAATGCAG 240

AAGAGTCCAG AATCCAGATT TGGTTTCAGA ATCAAAGAGC TAGGCATGGA TTCCAGAAAA 300 

CACCAGAACC TGACTTTAGA TTTAAGCCAC AGCCATGGAC AAGATTAACC TGGTGTGGAG 360 

TTTCAAAATA GAGAAGCCAG ATGGTGTTGT ACCACCTATA GCACCTTTCA ATTACACACA 420 

ATCATCCATG CATTTATGAA AAACCCATAC CCTGGGATTG ATTCCGGAGA ACAACTTGCT 480 

GAAGAAATTG GTGCTTCAGA GTCAAGAGTC CAAATTTGGT TCCAAAATCA AAGATCTAGA 540

TTTCATCTCC AGAGAAAAAG AGAACCTGTT ATGTCCTTAG AATGAGAAGA CCAGAGAAGA 600

CCAGGGGCAA GGTTTCTGAG GGACTTCAAG GTACAGAAGA TACACAAAGT GGCACCAGCC 660 

TCACTAGCAC TCTCATTTCT CAAGAGCCAG AACATGGTGA ATACAGTCAA GTTCAGTGTA 720 

TTTGATAATA TCAATTTGGG CCCCAAATCT CTCTCACAGT CTTCCTGGGA GTCTATTCTT 780

CTTCCAAAAG TGCAAGCTAA GCCTTCTGAA GATGGTAAAG AACTTGGCCG GGTGTGGTGG 840 

CTCATGCCTG TAATCCCAGC ACTTTAGGAG GCTGAGGCTG GAAGATTGCT TGAGCCTAGG 900 

AGTTTGAAAC CAGTCTGAGC AACATAGTAA GACCCTGTCT CTATTCTAAA AAACAAAATA 960

AGTAAAAAGG ACTGTAGGAG GCCAAGACAG GTACAGGAGG CACCACACTA CCCTGTTGAC 1020

ACAGCCTGGA TCCAGAGTTC AGCAGACCTT GAGACAATGA AAACAAACTT AGTAATAATC 1080 

ATTTTTCAAT CATTGCAGTA ATTATTGATT TGGACAAAAA TCAATTGACG TCAAAACCTT 1140 

AAAGTGACGT TTCTCTGCCT ATGGAGTGGT CATTCTTTTA TTCCTTTAGT TTCATAATAA 1200 

ATTTTCTTTT ACTTAAAAAA ACTTATAGTT TGATGAAGAG TGAGATATAT ACCTCATCTC 1260

AAAGAATCTT CACACACACA CTTATTAATT ACAAAAGGAA AATCAGTAAT TTTGCAGTGG 1320

AGACATATGG CCAACTCCAC CTTACCCAAG TGGCTGAAAG TCACTGCACC AGTAATGGCA 1380

CAAACCAATG TGAGATGATT CCTGATATGA TACACTAAAA AGGGCACTGT CTCTTCTGCA 1440 

TGTTGCAGAC AAAAAGTGGG TAAGCTGACA CTGAAACTAA TAATTAGGCA ATGTCAAGCA 1500 

AATACAAATT CAAGTTGACA GTCTGCAAAG TAACATCCAT GTACTCTTCA ACAATGGATC 1560 

GACCCTAGCT ACTCAGGAGG CTGAGGTGGA ATAATTGTTT GAGGCCAGGA GTTCCAGATC 1620

AGCCTGGGCA ACATCATGCG ACCCCATCTC TAAAAACATC TTTTTAAAAA TGAGCCAGGT 1680 

GTGGTAGCAT GCACCCGTAG TCTCAGCTAC TCAGGAGCCT GAGGCAGGAG GATGGTTTCA 1740 

ACATAGGAGA TCGAGGCTGC TGTGAGCTAT GATCGTGCTA CTGCACTCCA GCCTGGGTGA 1800 

CACAGCAAGT TCCTGTTTCC AAACAACAAC AAGAAAACAA AACAAAACAA AACAAAAAAT 1860 

AGATAGAATA GTGACAATAA AAATGGAGAA AAAGTAGGCT GACTCAGGAA ATGCTTAGAA 1920

AGTACAGCCA TACCTCAAAG ATATTGTAGA TTTGATTCGA GACCACCACA ATAAAGCAGA 1980

TATTGCTACA AAGTGAGTCA CACAAATTGT TTTGTTTCCT TGTGAATATG AAGTTATATT 2040 

GGCTCGGTGT GATGGCTCAT GCCTATAATC CCAGTACTTT AGGAGACGGA GGCGGGAGGG 2100 

TCACTTGAGC CCAGGAATTG TGAGATCAAC CTGGGCATAT AGGGAGATCC TGTCTCTATT 2160 

TAAAAAAAGA AGCTATGTTT ACACTACACT ATAGTCTATT TAAAGTGTGA AATGGCGTTA 2220

TGTCCTTAAT TTTAAAACTC TTGATGCTGG CTGGGTTCGG TGGCTCATAC CTGTAATCCC 2280 

ATCACTTTGG GAGGCCAAGA CAGGTTGATT ACTTGAATTC AGGAGTTCAA GACCAGCCTG 2340 

GACAACATGG CAAAACACGT CTTTAAAAAA AGAAAAGAAA AAAGAAAAAC AGAAAGAAAA 2400 
AGAAGAAAAA CTACTTGCTG CCCTTACTTG AAGCTCAATT ATTTAAAAC 

Seq ID NO: 16 DNA sequence

Nucleic Acid Accession #: CAT cluster 



1 11 21 31 41 51 

| | | | | |

CTTTTTTTTT TTTTTTTTTT TAGTAGAGAC AGGGTTTCAC CATGTTAGCC AGGATGGTCT 60

CGATCTCCTG ACCTCATGAT CTTCCTGCTT TGGCCTCCCA AAGTGCTGCG ATTACAGGCG 120 

TGAGCCACTG CACCCAGCCC AGAGTTTTTT TTAACAAGGT TCTTCTCAGC AATTCTAGTA 180 

TCCAGATATA GGCCCATCAT AGACATCACA CAAGCGTGTA CTTCATAATC CTGGTGAATA 240 

CAGAAGTTTC CTGGACTCCT TGATGAGCTA CTGCTTTCGC TCCTATATCA GTGTTTTCAG 300

CTGATGTCAT TTGTGATTGT GTTTCTGACT TTCTGTAGGC AGAAAAAAAC TTTCATTTTT 360 

TTTTTGCTTA CATGCACATA AATGTAAGCG CTAATTCTTA TATTAAACTG TTTATTTCTA 420 

TAATACTTAA TTGGCTGTTT TCCTGGCTGA ACCAAACCAA GAGCATAAGG AATGATAACC 480 

TTCAAAACTG ATTAAATTAG AGATCAATAA ATGGAGCTGT TTTAATTCTA TTATTCTTCT 540

TTCATAGATT AAATAGAAAA TTTTT

Seq ID NO: 17 DNA sequence 

Nucleic Acid Accession #: CAT cluster 

1 11 21 31 41 51

| | | | | |

GGCACGAGAA GACGCCACAT CCCCTATTAT AGAAGAGCTA ATAAATTTCC ATGATCACAC 60 

ACTAATAATT GTTTTCCTAA TTAGCTCCTT AGTCCTCTAT ATCATCTCGC TAATATTAAC 120

AACAAAACTA ACACATACAA GCACAATAGA TGCACAAGAA GTTGAAACCA TTTGAACTAT 180 

TCTACCAGCT GTAATCCTTA TCATAATTGC TCTCCCCTCT CTACGCATTC TATATATAAT 240

AGACGAAATC AACAACCCCG TATTAACCGT TAAAACCATA GGGCACCAAT GATACTGAAG 300 

CTACGAATAT ACTGACTATG AAGACCTATG CTTTGATTCA TATATAATCC CAACAAACGA 360 

CCTAAAACCT GGTGAACTAC GACTGCTAGA AGTTGATAAC CGAGTCGTTC TGCCAATAGA 420 

ACTTCCAATC CGTATATTAA TTTCATCTGA AGACGTCCTC CACTCATGAG CAGTCCCCTC 480

CCTAGGACTT AAAACTGATG CCATCCCAGG CCGACTAAAT CCAGCACAGT ACATCAACCG 540

ACCAGGGTTA TTCTATGGCC AATGTCTGAA TTTGTGGTCT TACCATAGCT TTTTGCCATT 600 
GTCCTAGAAT GGGTCCCTAA AATATTTCGG NACTGGTCTG 

Seq ID NO: 18 DNA sequence 
Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

I I I I I I 

GTGTACATCA GAGCAAAAAT ACAGAGTATT TATTCATTTC TTCCCACTAG AGGGACACAC 60 

TGTTCTTGGA CAGACAAATG AATCATCAGT TGTCAGGAGT TGCCTTTGGA GAATGATCAA 120

TGAACTCCTT TTCAGGGGTT GGAAATTGAT ACCAGGGTCC ATCACCTCGG GCACGCATCA 180 

GCCTTCGAAC TTCCTGCTCC TTTAACCGTA ACTCAGCCTT TTCAGATTCA ATCTGGAGGA 240 

TAGCCAGGGT TTTCTCGTAG TTCTTTTCAG GGCCATCATA GAAATTCCGG GCGATCCATC 300 

TTGATATCGG ATGCTTGTAA TACTCCCAGT GTTCAGGGAT GTAGCCTTCT GGGATTTCTG 360 

CAAGCTCGGC TTCACCAATA AATATGTTCA CCAGTGTTAT GCCAATTATA ACTGGGATCC 420

CAGTCAACAT AAGGTAGAAT TTCATTAACC TCAAGAAGCG AGCGTCATAG TATAAAGAAG 480 

GCTTGACGAC AAACAGTCTC TTGCCATGTC CCCACTGTGC CGCACAGGAG CGACAGTCTT 540 

CGGAAANTCC GCGTGAGAAA ACTTCCGACT CCGAGTCTAG GACCAGCGCG GCGGCAAGAC 600 
CACGCTGTCA GCGCGGAGAC CGAANCCGCT GCAGCAGCTC ATGGCCGCCA TGG






Seq ID NO: 19 DNA sequence 

Nucleic Acid Accession #: CAT cluster 



1 11 21 31 41 51

I I I I I I 

TAGTCCAGTN AATTACTTTA ATTTCGCTTT TCCATAATAC TGGTATTCCA TAGAAGAAAA 60 

TCTTTTATTA ATATTCTATA CTACTACATC CGACACCAGA TGACTAAAGT TTGCAATGGT 120 

CCAAAATTCT GTAAACCCAT TAAATGCAAT TCATACTTTA TTTTGGCAGT ATTCATTTCA 180

TCATTACTTT ATTTGGATGC TAACGCAAGT ACTTCTAAGG AAAAGCTGTC ATATAATTAC 240 

TTTAGTCAAG CATTCAGTAG AGGCAATAAT CAAACCTCTA TCCCAACATT TTACACTTGT 300 

AACAGAATGA AGGATGAGGT ACAACATACA TTTTTGGCAA TTTACTATTA AGGGCCATAA 360
TCATTTTAGG GGCGCTTAGG GCCCATATAT ATATATATAT ATTTTTGGAC A 



Seq ID NO: 20 DNA sequence
Nucleic Acid Accession #: U92072
Coding sequence: 351..3701

1 11 21 31 41 51

| | | | | |

GCCGGGCGGC TGCGCTGAGC AGAGGCCGAG CCCGGGACGC GCCGAGGGAC TGCGGGGCTG 60 

CGGGTCATGG ATGCGGCGGC AGCGGCGCGG GACGGCGGGA GCCCGGCCGC GACCAGGTGA 120 

GGAGGCGGCG TCCGGCGCCA CTGCAGCCGC AGCGGCCTCG GAGGAAGAGG GCTCGCCGCC 180 

GCGCGCCCGC CCGCCGTCGC TGCCCTTCCT GTTGGGATTA TCTTCTGCTC CCCGCTGCTT 240

CTTCGCTCCC CGCGGTCGAA GCCGCCTCTA GGCTTCAGCG GCTCGGACTC CTTGGCAGCC 300

GGTGCCTCTG CTACCTGGGC CTCGTAGCTG GGAGACCCTT GGGCGAGACC ATGAGGAAAT 360

TCAACATCAG GAAGGTGCTG GACGGCCTGA CCGCAGGCTC GTCCTCGGCC TCGCAACAGC 420 

AGCAACAGCA GCAGCACCCG CCTGGGAACC GGGAGCCCGA GATCCAGGAG ACGCTCCAGT 480

CCGAGCACTT CCAACTCTGC AAGACTGTTC GCCATGGATT TCCCTATCAG CCCTCAGCCC 540

TGGCCTTTGA TCCCGTTCAG AAGATCCTGG CGGTAGGAAC CCAGACTGGT GCTTTAAGGC 600 

TCTTTGGTCG TCCAGGGGTG GAATGTTATT GCCAGCACGA CAGCGGAGCG GCAGTGATTC 660 

AACTCCAGTT CCTGATTAAT GAGGGAGCCC TTGTGAGTGC CTTGGCTGAT GACACCTTAC 720 

ACTTGTGGAA TTTACGTCAG AAAAGGCCTG CTGTGCTACA TTCACTCAAA TTTTGCAGAG 780 

AAAGGGTTAC ATTTTGCCAT CTGCCTTTCC AGAGTAAGTG GCTCTATGTG GGCACGGAAC 840 

GAGGTAATAT ACATATTGTC AATGTGGAGT CCTTCACACT CTCAGGCTAC GTCATTATGT 900 

GGAATAAAGC CATCGAACTG TCATCTAAAT CTCACCCAGG ACCTGTTGTC CATATAAGTG 960 

ATAATCCCAT GGACGAGGGG AAGCTTCTGA TTGGCTTTGA ATCTGGAACA GTAGTCTTAT 1020 

GGGACCTTAA GTCAAAGAAG GCTGACTACA GATACACTTA CGACGAGGCT ATTCACTCTG 1080 

TGGCTTGGCA TCATGAAGGA AAACAGTTTA TTTGCAGTCA TTCTGATGGT ACATTGACCA 1140 

TATGGAATGT GAGGTCCCCT ACTAAACCTG TACAGACCAT CACTCCTCAC GGAAAACAGT 1200 

TAAAGGATGG GAAGAAACCC GAGCCGTGCA AGCCTATCCT CAAGGTGGAG TTCAAGACAA 1260 

CAAGATCGGG GGAACCTTTT ATTATTTTGT CGGGAGGCTT ATCATATGAT ACCGTGGGAA 1320 

GAAGACCTTG CTTAACAGTG ATGCATGGGA AAAGCACGGC AGTGCTGGAA ATGGACTATT 1380 

CAATTGTCGA CTTTCTCACA CTCTGTGAAA CGCCATATCC AAATGATTTT CAGGAGCCGT 1440

ATGCTGTGGT TGTTCTCCTG GAGAAGGATT TAGTGCTGAT AGACCTGGCA CAGAATGGAT 1500 

ACCCTATATT TGAGAATCCC TACCCTTTGA GTATACACGA GTCCCCTGTT ACATGTTGTG 1560 

AATATTTTGC TGATTGTCCT GTGGACCTTA TTCCTGCACT TTATTCTGTT GGAGCTAGAC 1620 

AGAAACGTCA AGGTTACAGC AAAAAGGAAT GGCCCATCAA TGGTGGTAAT TGGGGCTTGG 1680 

GTGCTCAAAG TTACCCAGAA ATAATTATTA CAGGGCATGC TGATGGCTCA ATTAAATTCT 1740

GGGATGCTTC TGCAATAACT CTACAAGTAC TGTATAAATT AAAAACATCT AAAGTATTTG 1800 

AAAAGTCAAG AAATAAAGAT GACAGACAGA ACACCGACAT TGTAGATGAA GATCCATATG 1860 

CCATTCAGAT CATCTCCTGG TGCCCAGAGA GCAGAATGCT GTGCATAGCC GGAGTGTCGG 1920 

CTCATGTCAT CATTTATAGA TTCAGCAAGC AGGAAGTGGT TACAGAAGTC ATCCCGATGC 1980

TTGAAGTCCG ACTGTTATAT GAAATAAATG ATGTGGAAAC GCCGGAGGGT GAGCAGCCAC 2040

CCCCTTTGTC CACTCCCGTG GGCAGCTCCA CCTCTCAGCC CATCCCCCCT CAGTCTCATC 2100

CGTCTACCAG CAGCAGCTCA TCGGACGGGC TTCGAGATAA TGTACCGTGT TTAAAAGTTA 2160 

AAAACTCACC ACTTAAACAG TCTCCCGGCT ATCAAACAGA GCTAGTCATC CAGTTGGTGT 2220 

GGGTGGGTGG AGAACCCCCG CAGCAGATCA CCAGCCTAGC ACTCAACTCT TCCTACGGAT 2280 

TGGTGGTTTT CGGCAACTCC AATGGCATTG CAATGGTTGA CTACCTCCAG AAAGCAGTGC 2340

TGCTCAACCT CAGCACCATT GAACTATACG GCTCAAATGA TCCTTATCGG AGAGAACCGA 2400

GGTCGCCCCG CAAATCTCGA CAGCCTTCAG GAGCGGGCCT GTGTGATATT ACCGAAGGAA 2460 

CTGTCGTCCC AGAGGATCGC TGCAAATCTC CGACTTCCGC AAAGATGTCA AGGAAATTAA 2520 

GCTTGCCAAC TGATCTAAAG CCTGATTTAG ATGTGAAAGA CAATTCCTTC AGCAGATCTC 2580 

GGAGTTCAAG TGTGACCAGC ATTGACAAAG AGTCCCGGGA AGCCATTTCT GCTCTTCATT 2640

TCTGTGAGAC TTTCACAAGG AAGGCAGACT CCTCCCCCTC CCCGTGCCTG TGGGTGGGAA 2700 

CCACAGTGGG AACTGCCTTT GTCATCACGC TGAATCTCCC CCTGGGGCCT GAGCAGAGAC 2760 

TGCTTCAGCC AGTGATTGTG TCTCCAAGCG GTACTATATT GAGGTTAAAA GGTGCGATCT 2820 

TGAGAATGGC ATTTCTGGAT GCCGCGGGCT GCTTAATGCC ACCTGCATAC GAACCCTGGA 2880 

CAGAGCACAA CGTTCCTGAA GAAAAAGACG AAAAGGAGAA ATTGAAAAAG CGGCGACCTG 2940

TCTCAGTGTC CCCCTCCTCT TCTCAGGAAA TTAGTGAAAA CCAGTACGCA GTGATATGTT 3000 

CTGAAAAGCA AGCAAAGGTC ATCTCACTGC CAACCCAGAA CTGTGCATAC AAGCAGAACA 3060 

TCACTGAGAC GTCCTTCGTG CTCCGTGGAG ACATTGTCGC CCTGAGTAAC AGTGTCTGCC 3120 

TCGCCTGCTT CTGTGCCAAC GGCCACATTA TGACTTTCAG TTTGCCGAGC TTGAGGCCTC 3180 

TGCTGGATGT CTACTACCTG CCCCTTACCA ACATGCGGAT AGCCAGGACA TTCTGCTTCG 3240

CCAACAGTGG GCAAGCCTTA TACCTTGTTT CACCTACCGA AATCCAGAGA CTCACCTACA 3300 

GTCAGGAGAC GTGTGAAAAC CTTCAGGAGA TGCTTGGTGA GCTCTTCACG CCTGTAGAAA 3360 

CACCAGAAGC ACCAAACAGA GGGTTCTTCA AAGGCTTATT TGGAGGTGGT GCACAATCTC 3420 

TTGATAGAGA AGAACTGTTT GGAGAGTCAT CCTCGGGAAA GGCGTCAAGG AGCCTTGCAC 3480

AGCACATCCC GGGTCCTGGC GGGATCGAAG GTGTGAAGGG AGCCGCGTCG GGAGTGGTGG 3540

GAGAACTGGC CCGAGCCAGG CTGGCCCTCG ACGAAAGAGG ACAGAAGCTC AGCGACTTGG 3600 

AAGAGAGGAC TGCAGCCATG ATGTCCAGTG CAGACTCGTT TTCCAAACAT GCTCATGAGA 3660 

TGATGCTGAA ATACAAAGAT AAGAAGTGGT ACCAGTTCTG ACAAGTAGCA CTCAGTAAGT 3720 

CCAGCTTCAA CCAGAAGGAA AAAGACGTTT CCTTGTTGAG GTCACTGATG TATTTGGGAA 3780 
AGATAACATA AAAGGGATGC ACACTGCTGA CAGCGTCTTT CCCAGCACAA TCATGCACTT

Seq ID NO: 21 Protein sequence

| | | | | |

MRKFNIRKVL DGLTAGSSSA SQQQQQQQHP PGNREPEIQE TLQSEHFQLC KTVRHGFPYQ 60 

PSALAFDPVQ KILAVGTQTG ALRLFGRPGV ECYCQHDSGA AVIQLQFLIN EGALVSALAD 120

DTLHLWNLRQ KRPAVLHSLK FCRERVTFCH LPFQSKWLYV GTERGNIHIV NVESFTLSGY 180

VIMWNKAIEL SSKSHPGPVV HISDNPMDEG KLLIGFESGT VVLWDLKSKK ADYRYTYDEA 240

IHSVAWHHEG KQFICSHSDG TLTIWNVRSP TKPVQTITPH GKQLKDGKKP EPCKPILKVE 300

FKTTRSGEPF IILSGGLSYD TVGRRPCLTV MHGKSTAVLE MDYSIVDFLT LCETPYPNDF 360

QEPYAVVVLL EKDLVLIDLA QNGYPIFENP YPLSIHESPV TCCEYFADCP VDLIPALYSV 420

GARQKRQGYS KKEWPINGGN WGLGAQSYPE IIITGHADGS IKFWDASAIT LQVLYKLKTS 480

KVFEKSRNKD DRQNTDIVDE DPYAIQIISW CPESRMLCIA GVSAHVIIYR FSKQEVVTEV 540

IPMLEVRLLY EINDVETPEG EQPPPLSTPV GSSTSQPIPP QSHPSTSSSS SDGLRDNVPC 600

LKVKNSPLKQ SPGYQTELVI QLVWVGGEPP QQITSLALNS SYGLVVFGNS NGIAMVDYLQ 660

KAVLLNLSTI ELYGSNDPYR REPRSPRKSR QPSGAGLCDI TEGTVVPEDR CKSPTSAKMS 720

RKLSLPTDLK PDLDVKDNSF SRSRSSSVTS IDKESREAIS ALHFCETFTR KADSSPSPCL 780

WVGTTVGTAF VITLNLPLGP EQRLLQPVIV SPSGTILRLK GAILRMAFLD AAGCLMPPAY 840

EPWTEHNVPE EKDEKEKLKK RRPVSVSPSS SQEISENQYA VICSEKQAKV ISLPTQNCAY 900

KQNITETSFV LRGDIVALSN SVCLACFCAN GHIMTFSLPS LRPLLDVYYL PLTNMRIART 960

FCFANSGQAL YLVSPTEIQR LTYSQETCEN LQEMLGELFT PVETPEAPNR GFFKGLFGGG 1020

AQSLDREELF GESSSGKASR SLAQHIPGPG GIEGVKGAAS GVVGELARAR LALDERGQKL 1080
SDLEERTAAM MSSADSFSKH AHEMMLKYKD KKWYQF 



Seq ID NO: 22 DNA sequence 
Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

I I I I I I 

TCCCATCGGG TGAACCGTGG TCTTGTTCCG TCCGCCCACA ATCGCTCTCC AGCTTTGACG 60 

GCCCCGGCAA AGCCTGGCTC GTTCACAGCT CTCTCGCACC TCCTGGAGCT TCAGCTTCTT 120

CCGTTGCAGA GAAGCTTTAT GGGCCAATTC GTTCGGCATC CCGGGGGCAG GTGCGCGGTG 180

CGCGGGGAAG AAGAGGATTT GACTGCGGTT CTCCACCCCC GGCGCCCAAC CTCCACCCCG 240 

GTGCGCGCGC TCTTCCAGGC TCCTGCTGGT CCCACTTGCC AGGAGTTAGG TCTCAGGTCA 300 

GCCTGAGCTC CTGAGACGCC CAGGCCCGGA AAGACACGTA GGGGAAACCA TCTGCTCACT 3 60 

TCTGTCCTGT CCGGAAGGGA TCCCTTTCTG ACGGGAAAGA AAGGCGCTAA ACAAGCACTG 420

GCCTTGAGAT AAGCAATGCT GAAGCACTTG CAGCTCACCT ATTACCATAA ACTGACTGAG 480

CCCTCCCTAC ACAAGCCGTA ACTACTGCTT TGATTGGACA AGAGACTGAT TTCAGTAGTT 540 

TTCTCTTGAT AAGAGACCAC TGGCCGTGGG CGGGTTCTGG ACAGTTTACA GAAGCTATGC 600 

ACTTGATTGC CTTTGTGTCC CTGCTTCACC TTTTGAAGCA TAGGGCCTAA TTATAATGTA 660 

TTTAAATGTT GTCTCCACCC CAAAGTGAAC ATGGGTTGCA TGTAACAGGC ATGTTTACTC 720

AGCATGCATG CAGCAGGATC CCTTCACAAA TATTCAGAGC TCCCCCTATT CCCTGTTGAA 780 

TATGTATATG TGGCCAGCCA GATCAACGTA AATCACTATT CGCCCTCCCC TCCCTGGAAA 840

CCTACTTTTC GGGTTTCAGC AGGAAGCTAT GCCTCCCAGG CTTGTCGAAG AGGGCCCATT 900 

TTCGGGCTTG ATAACCCCTT TATAAAAAAA TAAAATCTCC TTTCTAAATT TAAAATACAA 960 

CCACACCACC GGCCCGCAAC TATTGGGGGG GAAAAAGAAT GAAGACACAC GGTACATAGT 1020 

TTCATGCACA TTGTTAAGGA GACAGGTGCC CCCAAGCAGG CGGACATCAC GCAGTACGCA 1080 

GCTTGAGCAT GCCGAAGACG CGAGCGACTC ATAGAACACG ACGACGCTCG CAAGGCACTA 1140 
AGCATAGCTA CTACCACTCG TCGAAGAGTC ATACACAGAT TTCTATTGGC GA 



Seq ID NO: 23 DNA sequence 

Nucleic Acid Accession #: CAT cluster 



1 11 21 31 41 51 

I I I I I I 

CTATGAATCT CGGAAATTAC TCAAACCATC AGCCTCTGCA AGAAGCAAAG TGGACGGCCG 60 

GGCGCGGTGG CTCACTCCTG GAATCCCAGC ACTTTGGGAG CCCGAGGTGG CGGGATCACG 120 

AGGTCAGGAG ATCGAGACTG TTCTGGCTAA ACCAGTGAAA CCCCCTCTCT ACTAAAAAAA 180 

TAAGAAAAGC GAAGTGCATC TCCCATAAAC GAGGTACTGC AGGAAGAAAG CAGAAAATGA 240 

GACCCGAGTA CACACATGCA CGCGGGCGCC GCACACACAC ACCAGAAGAA ATGAACCAAG 300

AGGAAAGGAA ACATTTTCAA ATAAGCATTT GGAGATGGGA AAAACACCTT GAAACAGAAA 360

TTCATAAAGT ACAGAATTTT TTTTTAAGTT AAAAAAGGAA CAATAATAGA CAGAAAATGA 420 

ATGAAAAATT AAATGTCATA TCAGAAGTGA AGATAAATTA AAAGTGGTCA AAGGAGAAGA 480 

GATCTAAATG CAAACTTAAG AAGGGGCAAT TTTTTTTTTT TTTTTTTTTG AGACGCAGCC 540
TCACTCTGTC GC 



Seq ID NO: 24 DNA sequence 

Nucleic Acid Accession #: NM_000044.1 

Coding sequence: 1115..3874

1 11 21 31 41 51 

I I I I I I 

CGAGATCCCG GGGAGCCAGC TTGCTGGGAG AGCGGGACGG TCCGGAGCAA GCCCACAGGC 60 

AGAGGAGGCG ACAGAGGGAA AAAGGGCCGA GCTAGCCGCT CCAGTGCTGT ACAGGAGCCG 120 

AAGGGACGCA CCACGCCAGC CCCAGCCCGG CTCCAGCGAC AGCCAACGCC TCTTGCAGCG 180 

CGGCGGCTTC GAAGCCGCCG CCCGGAGCTG CCCTTTCCTC TTCGGTGAAG TTTTTAAAAG 240 

CTGCTAAAGA CTCGGAGGAA GCAAGGAAAG TGCCTGGTAG GACTGACGGC TGCCTTTGTC 300 

CTCCTCCTCT CCACCCCGCC TCCCCCCACC CTGCCTTCCC CCCCTCCCCC GTCTTCTCTC 360 

CCGCAGCTGC CTCAGTCGGC TACTCTCAGC CAACCCCCCT CACCACCCTT CTCCCCACCC 420 

GCCCCCCCGC CCCCGTCGGC CCAGCGCTGC CAGCCCGAGT TTGCAGAGAG GTAACTCCCT 480 

TTGGCTGCGA GCGGGCGAGC TAGCTGCACA TTGCAAAGAA GGCTCTTAGG AGCCAGGCGA 540

CTGGGGAGCG GCTTCAGCAC TGCAGCCACG ACCCGCCTGG TTAGAATTCC GGCGGAGAGA 600 

ACCCTCTGTT TTCCCCCACT CTCTCTCCAC CTCCTCCTGC CTTCCCCACC CCGAGTGCGG 660 

AGCAGAGATC AAAAGATGAA AAGGCAGTCA GGTCTTCAGT AGCCAAAAAA CAAAACAAAC 720 

AAAAACAAAA AAGCCGAAAT AAAAGAAAAA GATAATAACT CAGTTCTTAT TTGCACCTAC 780

TTCAGTGGAC ACTGAATTTG GAAGGTGGAG GATTTTGTTT TTTTCTTTTA AGATCTGGGC 840 

ATCTTTTGAA TCTACCCTTC AAGTATTAAG AGACAGACTG TGAGCCTAGC AGGGCAGATC 900 

TTGTCCACCG TGTGTCTTCT TCTGCACGAG ACTTTGAGGC TGTCAGAGCG CTTTTTGCGT 960 

GGTTGCTCCC GCAAGTTTCC TTCTCTGGAG CTTCCCGCAG GTGGGCAGCT AGCTGCAGCG 1020

ACTACCGCAT CATCACAGCC TGTTGAACTC TTCTGAGCAA GAGAAGGGGA GGCGGGGTAA 1080 

GGGAAGTAGG TGGAAGATTC AGCCAAGCTC AAGGATGGAA GTGCAGTTAG GGCTGGGAAG 1140 

GGTCTACCCT CGGCCGCCGT CCAAGACCTA CCGAGGAGCT TTCCAGAATC TGTTCCAGAG 1200 

CGTGCGCGAA GTGATCCAGA ACCCGGGCCC CAGGCACCCA GAGGCCGCGA GCGCAGCACC 1260 

TCCCGGCGCC AGTTTGCTGC TGCTGCAGCA GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA 1320

GCAGCAGCAG CAGCAGCAGC AGCAGCAAGA GACTAGCCCC AGGCAGCAGC AGCAGCAGCA 1380 

GGGTGAGGAT GGTTCTCCCC AAGCCCATCG TAGAGGCCCC ACAGGCTACC TGGTCCTGGA 1440 

TGAGGAACAG CAACCTTCAC AGCCGCAGTC GGCCCTGGAG TGCCACCCCG AGAGAGGTTG 1500 

CGTCCCAGAG CCTGGAGCCG CCGTGGCCGC CAGCAAGGGG CTGCCGCAGC AGCTGCCAGC 1560 

ACCTCCGGAC GAGGATGACT CAGCTGCCCC ATCCACGTTG TCCCTGCTGG GCCCCACTTT 1620

CCCCGGCTTA AGCAGCTGCT CCGCTGACCT TAAAGACATC CTGAGCGAGG CCAGCACCAT 1680 

GCAACTCCTT CAGCAACAGC AGCAGGAAGC AGTATCCGAA GGCAGCAGCA GCGGGAGAGC 1740 

GAGGGAGGCC TCGGGGGCTC CCACTTCCTC CAAGGACAAT TACTTAGGGG GCACTTCGAC 1800 

CATTTCTGAC AACGCCAAGG AGTTGTGTAA GGCAGTGTCG GTGTCCATGG GCCTGGGTGT 1860 

GGAGGCGTTG GAGCATCTGA GTCCAGGGGA ACAGCTTCGG GGGGATTGCA TGTACGCCCC 1920

ACTTTTGGGA GTTCCACCCG CTGTGCGTCC CACTCCTTGT GCCCCATTGG CCGAATGCAA 1980 

AGGTTCTCTG CTAGACGACA GCGCAGGCAA GAGCACTGAA GATACTGCTG AGTATTCCCC 2040 

TTTCAAGGGA GGTTACACCA AAGGGCTAGA AGGCGAGAGC CTAGGCTGCT CTGGCAGCGC 2100

TGCAGCAGGG AGCTCCGGGA CACTTGAACT GCCGTCTACC CTGTCTCTCT ACAAGTCCGG 2160 

AGCACTGGAC GAGGCAGCTG CGTACCAGAG TCGCGACTAC TACAACTTTC CACTGGCTCT 2220

GGCCGGACCG CCGCCCCCTC CGCCGCCTCC CCATCCCCAC GCTCGCATCA AGCTGGAGAA 2280 

CCCGCTGGAC TACGGCAGCG CCTGGGCGGC TGCGGCGGCG CAGTGCCGCT ATGGGGACCT 2340 

GGCGAGCCTG CATGGCGCGG GTGCAGCGGG ACCCGGTTCT GGGTCACCCT CAGCCGCCGC 2400 

TTCCTCATCC TGGCACACTC TCTTCACAGC CGAAGAAGGC CAGTTGTATG GACCGTGTGG 2460 

TGGTGGTGGG GGTGGTGGCG GCGGCGGCGG CGGCGGCGGC GGCGGCGGCG GCGGCGGCGG 2520

CGGCGGCGGC GAGGCGGGAG CTGTAGCCCC CTACGGCTAC ACTCGGCCCC CTCAGGGGCT 2580 

GGCGGGCCAG GAAAGCGACT TCACCGCACC TGATGTGTGG TACCCTGGCG GCATGGTGAG 2640 

CAGAGTGCCC TATCCCAGTC CCACTTGTGT CAAAAGCGAA ATGGGCCCCT GGATGGATAG 2700 

CTACTCCGGA CCTTACGGGG ACATGCGTTT GGAGACTGCC AGGGACCATG TTTTGCCCAT 2760

TGACTATTAC TTTCCACCCC AGAAGACCTG CCTGATCTGT GGAGATGAAG CTTCTGGGTG 2820

TCACTATGGA GCTCTCACAT GTGGAAGCTG CAAGGTCTTC TTCAAAAGAG CCGCTGAAGG 2880 

GAAACAGAAG TACCTGTGCG CCAGCAGAAA TGATTGCACT ATTGATAAAT TCCGAAGGAA 2940 

AAATTGTCCA TCTTGTCGTC TTCGGAAATG TTATGAAGCA GGGATGACTC TGGGAGCCCG 3000 

GAAGCTGAAG AAACTTGGTA ATCTGAAACT ACAGGAGGAA GGAGAGGCTT CCAGCACCAC 3060 

CAGCCCCACT GAGGAGACAA CCCAGAAGCT GACAGTGTCA CACATTGAAG GCTATGAATG 3120

TCAGCCCATC TTTCTGAATG TCCTGGAAGC CATTGAGCCA GGTGTAGTGT GTGCTGGACA 3180 

CGACAACAAC CAGCCCGACT CCTTTGCAGC CTTGCTCTCT AGCCTCAATG AACTGGGAGA 3240 

GAGACAGCTT GTACACGTGG TCAAGTGGGC CAAGGCCTTG CCTGGCTTCC GCAACTTACA 3300 

CGTGGACGAC CAGATGGCTG TCATTCAGTA CTCCTGGATG GGGCTCATGG TGTTTGCCAT 3360 

GGGCTGGCGA TCCTTCACCA ATGTCAACTC CAGGATGCTC TACTTCGCCC CTGATCTGGT 3420

TTTCAATGAG TACCGCATGC ACAAGTCCCG GATGTACAGC CAGTGTGTCC GAATGAGGCA 3480 

CCTCTCTCAA GAGTTTGGAT GGCTCCAAAT CACCCCCCAG GAATTCCTGT GCATGAAAGC 3540 

ACTGCTACTC TTCAGCATTA TTCCAGTGGA TGGGCTGAAA AATCAAAAAT TCTTTGATGA 3600 

ACTTCGAATG AACTACATCA AGGAACTCGA TCGTATCATT GCATGCAAAA GAAAAAATCC 3660 

CACATCCTGC TCAAGACGCT TCTACCAGCT CACCAAGCTC CTGGACTCCG TGCAGCCTAT 3720

TGCGAGAGAG CTGCATCAGT TCACTTTTGA CCTGCTAATC AAGTCACACA TGGTGAGCGT 3780 

GGACTTTCCG GAAATGATGG CAGAGATCAT CTCTGTGCAA GTGCCCAAGA TCCTTTCTGG 3840 

GAAAGTCAAG CCCATCTATT TCCACACCCA GTGAAGCATT GGAAACCCTA TTTCCCCACC 3900 

CCAGCTCATG CCCCCTTTCA GATGTCTTCT GCCTGTTATA ACTCTGCACT ACTCCTCTGC 3960 

AGTGCCTTGG GGAATTTCCT CTATTGATGT ACAGTCTGTC ATGAACATGT TCCTGAATTC 4020

TATTTGCTGG GCTTTTTTTT TCTCTTTCTC TCCTTTCTTT TTCTTCTTCC CTCCCTATCT 4080 

AACCCTCCCA TGGCACCTTC AGACTTTGCT TCCCATTGTG GCTCCTATCT GTGTTTTGAA 4140 

TGGTGTTGTA TGCCTTTAAA TCTGTGATGA TCCTCATATG GCCCAGTGTC AAGTTGTGCT 4200 

TGTTTACAGC ACTACTCTGT GCCAGCCACA CAAACGTTTA CTTATCTTAT GCCACGGGAA 4260 

GTTTAGAGAG CTAAGATTAT CTGGGGAAAT CAAAACAAAA AACAAGCAAA CAAAAAAAAA 4320
A 

Seq ID NO: 25 Protein sequence 
Protein Accession #: NP_000035.1


1 11 21 31 41 51
MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVIQNPGPR HPEAASAAPP GASLLLLQQQ 60

QQQQQQQQQQ QQQQQQQQET SPRQQQQQQG EDGSPQAHRR GPTGYLVLDE EQQPSQPQSA 120

LECHPERGCV PEPGAAVAAS KGLPQQLPAP PDEDDSAAPS TLSLLGPTFP GLSSCSADLK 180

DILSEASTMQ LLQQQQQEAV SEGSSSGRAR EASGAPTSSK DNYLGGTSTI SDNAKELCKA 240 

VSVSMGLGVE ALEHLSPGEQ LRGDCMYAPL LGVPPAVRPT PCAPLAECKG SLLDDSAGKS 300 

TEDTAEYSPF KGGYTKGLEG ESLGCSGSAA AGSSGTLELP STLSLYKSGA LDEAAAYQSR 360

DYYNFPLALA GPPPPPPPPH PHARIKLENP LDYGSAWAAA AAQCRYGDLA SLHGAGAAGP 420 

GSGSPSAAAS SSWHTLFTAE EGQLYGPCGG GGGGGGGGGG GGGGGGGGGG GGEAGAVAPY 480

GYTRPPQGLA GQESDFTAPD VWYPGGMVSR VPYPSPTCVK SEMGPWMDSY SGPYGDMRLE 540

TARDHVLPID YYFPPQKTCL ICGDEASGCH YGALTCGSCK VFFKRAAEGK QKYLCASRND 600

CTIDKFRRKN CPSCRLRKCY EAGMTLGARK LKKLGNLKLQ EEGEASSTTS PTEETTQKLT 660

VSHIEGYECQ PIFLNVLEAI EPGVVCAGHD NNQPDSFAAL LSSLNELGER QLVHVVKWAK 720

ALPGFRNLHV DDQMAVIQYS WMGLMVFAMG WRSFTNVNSR MLYFAPDLVF NEYRMHKSRM 780

YSQCVRMRHL SQEFGWLQIT PQEFLCMKAL LLFSIIPVDG LKNQKFFDEL RMNYIKELDR 840

IIACKRKNPT SCSRRFYQLT KLLDSVQPIA RELHQFTFDL LIKSHMVSVD FPEMMAEIIS 900 
VQVPKILSGK VKPIYFHTQ 

Seq ID NO: 26 DNA sequence
Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

I I I I I I 

AGCATTATCC ATGGCCAGTG ATTGATGGAC TTGTTCAGGT CCTATGCAGA GTGCTTCATA 60 

TATCTCATCT CAATCCTCTA AATAACCATG AAAGTTGATG ATTATCTCAT GGTACAGATG 120 

GGAGGCTAAG AGTGTTTAAT TTTCCCCAAG TTCCAGTGCT AGTAAGTGTT GNNNNNNNNN 180

NNTGAACCTG TGTTAATGGT GTTTCTAGTC GATGCTGTTA TCTGTTGCAC CACATTTTGA 240 

ATAATCTTGG ACTTTCAGAG TATGAAGGAC GATTAAATAT AACCCTTTGG TATAAATGTT 300 

CTCTCTCTCG CTCCTCTGTA ACAATTGGAG AAACAGAGTT CTAACAATAT TAAAATCAGC 360 

CATAGACAGA GAGTAGTGAG AAATATACTT TTTTTAATAC AGAAGGTTCC CTGAAGTACT 420

TTTAGTATTA TTCTAAATTA AGCAATAACC AATGAACAAT TTTGGTCATA AGCAGTTTCT 480 
CTCCAGAAAA AAAAAAAAAA AGTCGAC 

Seq ID NO: 27 DNA sequence 
Nucleic Acid Accession #: NM_006551.2
Coding sequence: 64..336

1 11 21 31 41 51 

I I I I I I 

AATTCTAGAA GTCCAAATCA CTCATTGTTT GTGAAAGCTG AGCTCACAGC AAAACAAGCC 60 

ACCATGAAGC TGTCGGTGTG TCTCCTGCTG GTCACGCTGG CCCTCTGCTG CTACCAGGCC 120 

AATGCCGAGT TCTGCCCAGC TCTTGTTTCT GAGCTGTTAG ACTTCTTCTT CATTAGTGAA 180 

CCTCTGTTCA AGTTAAGTCT TGCCAAATTT GATGCCCCTC CGGAAGCTGT TGCAGCCAAG 240

TTAGGAGTGA AGAGATGCAC GGATCAGATG TCCCTTCAGA AACGAAGCCT CATTGCGGAA 300 

GTCCTGGTGA AAATATTGAA GAAATGTAGT GTGTGACATG TAAAAACTTT CATCCTGGTT 360 

TCCACTGTCT TTCAATGACA CCCTGATCTT CACTGCAGAA TGTAAAGGTT TCAACGTCTT 420 
GCTTTAATAA ATCACTTGCT CTAC 

Seq ID NO: 28 Protein sequence
Protein Accession #: NP_006542.1

1 11 21 31 41 51 

I I I I I I 

MKLSVCLLLV TLALCCYQAN AEFCPALVSE LLDFFFISEP LFKLSLAKFD APPEAVAAKL 60
GVKRCTDQMS LQKRSLIAEV LVKILKKCSV 

Seq ID NO: 29 DNA sequence

Nucleic Acid Accession #: NM_002645.1 
Coding sequence: 1..5061 

1 11 21 31 41 51 

| | | | | |

ATGGCTCAGA TATTTAGCAA CAGCGGATTT AAAGAATGTC CATTTTCACA TCCGGAACCA 60 

ACAAGAGCAA AAGATGTGGA CAAAGAAGAA GCATTACAGA TGGAAGCAGA GGCTTTAGCA 120 

AAACTGCAAA AGGATAGACA AGTGACTGAC AATCAGAGAG GCTTTGAGTT GTCAAGCAGC 180 

ACCAGAAAAA AAGCACAGGT TTATAACAAG CAGGATTATG ATCTCATGGT GTTTCCTGAA 240 

TCAGATTCCC AAAAAAGAGC ATTAGATATT GATGTAGAAA AGCTCACCCA AGCTGAACTT 300 

GAGAAACTAT TGCTGGATGA CAGTTTCGAG ACTAAAAAAA CACCTGTATT ACCAGTTACT 360 

CCTATTCTGA GCCCTTCCTT TTCAGCACAG CTCTATTTTA GACCTACTAT TCAGAGAGGA 420 

CAGTGGCCAC CTGGATTACC TGGGCCTTCC ACTTATGCTT TACCTTCTAT TTATCCTTCT 480 

ACTTACAGTA AACAGGCTGC ATTCCAAAAT GGCTTCAATC CAAGAATGCC CACTTTTCCA 540 

TCTACAGAAC CTATATATTT AAGTCTTCCG GGACAATCTC CATATTTCTC ATATCCTTTG 600 

ACACCTGCCA CACCCTTTCA TCCACAAGGA AGCTTACCTA TCTATCGTCC AGTAGTCAGT 660 

ACTGACATGG CAAAACTATT TGACAAAATA GCTAGTACAT CAGAATTTTT AAAAAATGGG 720 

AAAGCAAGGA CTGATTTGGA GATAACAGAT TCAAAAGTCA GCAATCTACA GGTATCTCCA 780 

AAGTCTGAGG ATATCAGTAA ATTTGACTGG TTAGACTTGG ATCCTCTAAG TAAGCCTAAG 840 

GTGGATAATG TGGAGGTATT AGACCATGAG GAAGAGAAAA ATGTTTCAAG TTTGCTAGCA 900 

AAGGATCCTT GGGATGCTGT TCTTCTTGAA GAGAGATCGA CAGCAAATTG TCATCTTGAA 960 

AGAAAGGTGA ATGGAAAATC CCTTTCTGTG GCAACTGTTA CAAGAAGCCA GTCTTTAAAT 1020 

ATTCGAACAA CTCAGCTTGC AAAAGCCCAG GGCCATATAT CTCAGAAAGA CCCAAATGGG 1080 

ACCAGTAGTT TGCCAACTGG AAGTTCTCTT CTTCAAGAAG TTGAAGTACA GAATGAGGAG 1140 

ATGGCAGCTT TTTGTCGATC CATTACAAAA TTGAAGACCA AATTTCCATA TACCAATCAC 1200 

CGCACAAACC CAGGCTATTT GTTAAGTCCA GTCACAGCGC AAAGAAACAT ATGCGGAGAA 1260 

AATGCTAGTG TGAAGGTCTC CATTGACATT GAAGGATTTC AGCTACCAGT TACTTTTACG 1320 

TGTGATGTGA GTTCTACTGT AGAAATCATT ATAATGCAAG CCCTTTGCTG GGTACATGAT 1380 

GACTTGAATC AAGTAGATGT TGGCAGCTAT GTTCTAAAAG TTTGTGGTCA AGAGGAAGTG 1440 

CTGCAGAATA ATCATTGCCT TGGAAGTCAT GAGCATATTC AAAACTGTCG AAAATGGGAC 1500 

ACAGAAATTA GACTACAACT CTTGACCTTC AGTGCAATGT GTCAAAATCT GGCCCGAACA 1560 

GCAGAAGATG ATGAAACACC CGTGGATTTA AACAAACACC TGTATCAAAT AGAAAAACCT 1620 

TGCAAAGAAG CCATGACGAG ACACCCTGTT GAAGAACTCT TAGATTCTTA TCACAACCAA 1680 

GTAGAACTGG CTCTTCAAAT TGAAAACCAA CACCGAGCAG TAGATCAAGT AATTAAAGCT 1740 

GTAAGAAAAA TCTGTAGTGC TTTAGATGGT GTCGAGACTC TTGCCATTAC AGAATCAGTA 1800 

AAGAAGCTAA AGAGAGCAGT TAATCTTCCA AGGAGTAAAA CTGCTGATGT GACTTCTTTG 1860 

TTTGGAGGAG AAGACACTAG CAGGAGTTCA ACTAGGGGCT CACTTAATCC TGAAAATCCT 1920 

GTTCAAGTAA GCATAAACCA ATTAACTGCA GCAATTTATG ATCTTCTCAG ACTCCATGCA 1980 

AATTCTGGTA GGAGTCCTAC AGACTGTGCC CAAAGTAGCA AGAGTGTCAA GGAAGCATGG 2040

ACTACAACAG AGCAGCTCCA GTTTACTATT TTTGCTGCTC ATGGAATTTC AAGTAATTGG 2100

GTATCAAATT ATGAAAAATA CTACTTGATA TGTTCACTGT CTCACAATGG AAAGGATCTT 2160 

TTTAAACCTA TTCAATCAAA GAAGGTTGGC ACTTACAAGA ATTTCTTCTA TCTTATTAAA 2220 

TGGGATGAAC TAATCATTTT TCCTATCCAG ATATCACAAT TGCCATTAGA ATCAGTTCTT 2280 

CACCTTACTC TTTTTGGAAT TTTAAATCAG AGCAGTGGAA GTTCCCCTGA TTCTAATAAG 2340

CAGAGAAAGG GACCAGAAGC TTTGGGCAAA GTTTCTTTAC CTCTTTGTGA CTTTAGACGG 2400 

TTTTTAACAT GTGGAACTAA ACTTCTATAT CTTTGGACTT CATCACATAC AAATTCTGTT 2460

CCTGGAACAG TTACCAAAAA AGGATATGTC ATGGAAAGAA TAGTGCTACA GGTTGATTTT 2520 

CCTTCTCCTG CATTTGATAT TATTTATACA ACTCCTCAAG TTGACAGAAG CATTATACAG 2580

CAACATAACT TAGAAACACT AGAGAATGAT ATAAAAGGGA AACTTCTTGA TATTCTTCAT 2640 

AAAGACTCAT CACTTGGACT TTCTAAAGAA GATAAAGCTT TTTTATGGGA GAAACGTTAT 2700 

TATTGCTTCA AACACCCAAA TTGTCTTCCT AAAATATTAG CAAGCGCCCC AAACTGGAAA 2760 

TGGGGTAATC TTGCCAAAAC TTACTCATTG CTTCACCAGT GGCCTGCATT GTACCCACTA 2820 

ATTGCATTGG AACTTCTTGA TTCAAAATTT GCTGATCAGG AAGTAAGATC CCTAGCTGTG 2880

ACCTGGATTG AGGCCATTAG TGATGATGAG CTAACAGATC TTCTTCCACA GTTTGTACAA 2940 

GCTTTGAAAT ATGAAATTTA CTTGAATAGT TCATTAGTGC AATTCCTTTT GTCCAGGGCA 3000 

TTGGGAAATA TCCAGATAGC ACACAATTTA TATTGGCTTC TCAAAGATGC CCTGCATGAT 3060 

GTACAGTTTA GTACCCGATA CGAACATGTT TTGGGTGCTC TCCTGTCAGT AGGAGGAAAA 3120 

CGACTTAGAG AAGAACTTCT AAAACAGACG AAACTTGTAC AGCTTTTAGG AGGAGTAGCA 3180

GAAAAAGTAA GGCAGGCTAG TGGATCAGCC AGACAGGTTG TTCTCCAAAG AAGTATGGAA 3240 

CGAGTACAGT CCTTTTTTCA GAAAAATAAA TGCCGTCTCC CTCTCAAGCC AAGTCTAGTG 3300 

GCAAAAGAAT TAAATATTAA GTCGTGTTCC TTCTTCAGTT CTAATGCTGT CCCCCTAAAA 3360 

GTCACAATGG TGAATGCTGA CCCTCTGGGA GAAGAAATTA ATGTCATGTT TAAGGTTGGT 3420 

GAAGATCTTC GGCAAGATAT GTTAGCTTTA CAGATGATAA AGATTATGGA TAAGATCTGG 3480

CTTAAAGAAG GACTAGATCT GAGGATGGTA ATTTTCAAAT GTCTCTCAAC TGGCAGAGAT 3540

CGAGGCATGG TGGAGCTGGT TCCTGCTTCC GATACCCTCA GGAAAATCCA AGTGGAATAT 3600

GGTGTGACAG GATCCTTTAA AGATAAACCA CTTGCAGAGT GGCTAAGGAA ATACAATCCC 3660 

TCTGAAGAAG AATATGAAAA GGCTTCAGAG AACTTTATCT ATTCCTGTGC TGGATGCTGT 3720 

GTAGCCACCT ATGTTTTAGG CATCTGTGAT CGACACAATG ACAATATAAT GCTTCGAAGC 3780

ACGGGACACA TGTTTCACAT TGACTTTGGA AAGTTTTTGG GACATGCACA GATGTTTGGC 3840 

AGCTTCAAAA GGGATCGGGC TCCTTTTGTG CTGACCTCTG ATATGGCATA TGTCATTAAT 3900 

GGGGGTGAAA AGCCCACCAT TCGTTTTCAG TTGTTTGTGG ACCTCTGCTG TCAGGCCTAC 3960 

AACTTGATAA GAAAGCAGAC AAACCTTTTT CTTAACCTCC TTTCACTGAT GATTCCTTCA 4020 

GGGTTACCAG AACTTACAAG TATTCAAGAT TTGAAATACG TTAGAGATGC ACTTCAACCC 4080

CAAACTACAG ACGCAGAAGC TACAATTTTC TTTACTAGGC TTATTGAATC AAGTTTGGGA 4140 

AGCATTGCCA CAAAGTTTAA CTTCTTCATT CACAACCTTG CTCAGCTTCG TTTTTCTGGT 4200 

CTTCCTTCTA ATGATGAGCC CATCCTTTCA TTTTCACCTA AAACATACTC CTTTAGACAA 4260 

GATGGTCGAA TCAAGGAAGT CTCTGTTTTT ACATATCATA AGAAATACAA CCCAGATAAA 4320 

CATTATATTT ATGTAGTCCG AATTTTGTGG GAAGGACAGA TTGAACCATC ATTTGTCTTC 4380

CGAACATTTG TCGAATTTCA GGAACTTCAC AATAAGCTCA GTATTATTTT TCCACTTTGG 4440 

AAGTTACCAG GCTTTCCTAA TAGGATGGTT CTAGGAAGAA CACACATAAA AGATGTAGCA 4500 

GCCAAAAGGA AAATTGAGTT AAACAGTTAC TTACAGAGTT TGATGAATGC TTCAACGGAT 4560 

GTAGCAGAGT GTGATCTTGT TTGTACTTTC TTCCACCCTT TACTTCGTGA TGAGAAAGCT 4620 

GAAGGGATAG CTAGGTCTGC AGATGCAGGT TCCTTCAGTC CTACTCCAGG CCAAATAGGA 4680

GGAGCTGTGA AATTATCCAT CTCTTACCGA AATGGTACTC TTTTCATCAT GGTGATGCAT 4740 

ATCAAAGATC TTGTTACTGA AGATGGAGCT GACCCAAATC CATATGTCAA AACATACCTA 4800 

CTTCCAGATA ACCACAAAAC ATCCAAACGT AAAACCAAAA TTTCACGAAA AACGAGGAAT 4860 

CCGACATTCA ATGAAATGCT TGTATACAGT GGATATAGCA AAGAAACCCT AAGACAGCGA 4920 

GAACTTCAAC TAAGTGTACT CAGTGCAGAA TCTCTGCGGG AGAATTTTTT CTTGGGTGGA 4980

GTAACCCTGC CTTTGAAAGA TTTCAACTTG AGCAAAGAGA CGGTTAAATG GTATCAGCTG 5040
ACTGCGGCAA CATACTTGTA A 

Seq ID NO: 30 Protein sequence 
Protein Accession #: NP_002636.1

1 11 21 31 41 51 

I I I I I I 

MAQIFSNSGF KECPFSHPEP TRAKDVDKEE ALQMEAEALA KLQKDRQVTD NQRGFELSSS 60 

TRKKAQVYNK QDYDLMVFPE SDSQKRALDI DVEKLTQAEL EKLLLDDSFE TKKTPVLPVT 120

PILSPSFSAQ LYFRPTIQRG QWPPGLPGPS TYALPSIYPS TYSKQAAFQN GFNPRMPTFP 180

STEPIYLSLP GQSPYFSYPL TPATPFHPQG SLPIYRPVVS TDMAKLFDKI ASTSEFLKNG 240

KARTDLEITD SKVSNLQVSP KSEDISKFDW LDLDPLSKPK VDNVEVLDHE EEKNVSSLLA 300 

KDPWDAVLLE ERSTANCHLE RKVNGKSLSV ATVTRSQSLN IRTTQLAKAQ GHISQKDPNG 360 

TSSLPTGSSL LQEVEVQNEE MAAFCRSITK LKTKFPYTNH RTNPGYLLSP VTAQRNICGE 420

NASVKVSIDI EGFQLPVTFT CDVSSTVEII IMQALCWVHD DLNQVDVGSY VLKVCGQEEV 480

LQNNHCLGSH EHIQNCRKWD TEIRLQLLTF SAMCQNLART AEDDETPVDL NKHLYQIEKP 540 

CKEAMTRHPV EELLDSYHNQ VELALQIENQ HRAVDQVIKA VRKICSALDG VETLAITESV 600 

KKLKRAVNLP RSKTADVTSL FGGEDTSRSS TRGSLNPENP VQVSINQLTA AIYDLLRLHA 660 

NSGRSPTDCA QSSKSVKEAW TTTEQLQFTI FAAHGISSNW VSNYEKYYLI CSLSHNGKDL 720

FKPIQSKKVG TYKNFFYLIK WDELIIFPIQ ISQLPLESVL HLTLFGILNQ SSGSSPDSNK 780

QRKGPEALGK VSLPLCDFRR FLTCGTKLLY LWTSSHTNSV PGTVTKKGYV MERIVLQVDF 840

PSPAFDIIYT TPQVDRSIIQ QHNLETLEND IKGKLLDILH KDSSLGLSKE DKAFLWEKRY 900

YCFKHPNCLP KILASAPNWK WGNLAKTYSL LHQWPALYPL IALELLDSKF ADQEVRSLAV 960

TWIEAISDDE LTDLLPQFVQ ALKYEIYLNS SLVQFLLSRA LGNIQIAHNL YWLLKDALHD 1020

VQFSTRYEHV LGALLSVGGK RLREELLKQT KLVQLLGGVA EKVRQASGSA RQVVLQRSME 1080

RVQSFFQKNK CRLPLKPSLV AKELNIKSCS FFSSNAVPLK VTMVNADPLG EEINVMFKVG 1140

EDLRQDMLAL QMIKIMDKIW LKEGLDLRMV IFKCLSTGRD RGMVELVPAS DTLRKIQVEY 1200

GVTGSFKDKP LAEWLRKYNP SEEEYEKASE NFIYSCAGCC VATYVLGICD RHNDNIMLRS 1260

TGHMFHIDFG KFLGHAQMFG SFKRDRAPFV LTSDMAYVIN GGEKPTIRFQ LFVDLCCQAY 1320

NLIRKQTNLF LNLLSLMIPS GLPELTSIQD LKYVRDALQP QTTDAEATIF FTRLIESSLG 1380

SIATKFNFFI HNLAQLRFSG LPSNDEPILS FSPKTYSFRQ DGRIKEVSVF TYHKKYNPDK 1440

HYIYVVRILW EGQIEPSFVF RTFVEFQELH NKLSIIFPLW KLPGFPNRMV LGRTHIKDVA 1500

AKRKIELNSY LQSLMNASTD VAECDLVCTF FHPLLRDEKA EGIARSADAG SFSPTPGQIG 1560

GAVKLSISYR NGTLFIMVMH IKDLVTEDGA DPNPYVKTYL LPDNHKTSKR KTKISRKTRN 1620

PTFNEMLVYS GYSKETLRQR ELQLSVLSAE SLRENFFLGG VTLPLKDFNL SKETVKWYQL 1680
TAATYL 

Seq ID NO: 31 DNA sequence

Nucleic Acid Accession #: CAT cluster 






i 
I 

TTTTTTTTAG 
ACCTCAAAAT 
TTGGTATTTT 
TATGAAAAAA 
GTGATTTGTT 
AGATGAGCAC 
ATTTATTTTT 



11 
I 

AGACTAAACC 
ACATT CTGGA 
CACTGTCAAT 
AGCTACCTCA 
AAGCACTCAC 
TGACTTTCCC 
CTTGTATGCA 
TTTGTTTTTG 



21 
I 

ATAGCAAGGA 
ATTTGTAAGG 
TATGCCTCGT 
TAGAGCTCAT 
ATCAATAAAA 
CATTGAGGAG 
TAGCTGGGTT 
TAGGTCCTAT 



31 
I 

GTTTGTGATC 
GATGCTTTCG 
ATTATTTATT 
GACACATAAT 
TATTTCAGCT 
TCTCGATTAC 
CAAGAGTTCT 
AATACAGTAA 



41 
I 

ACTGTATAGC 
TCGACTTTTT 
TATTTGCCAA 
AGGTATTCAC 
CAACAGGCAC 
CTCATGTCTC 
TTCTTGTTTT 
AGACATCAAA 



Seq ID NO: 32 DNA sequence 

Nucleic Acid Accession #: CAT cluster 



1 
I 

CAGTCAGATT 
CTTTCTTGAT 
GATTTCAGTT 
GCCAAGAAAA 
ACTTCCTTTG 
TGTCTTATAT 
TGGCAATAGT 
GGGTTGCATT 
TTTTTCAAGA 



11 
I 

TTTTTTTTGC 
TCCTCTTTTT 
TCAGGTGATT 
ATTTTAAACT 
TTGACTTTCT 
GTACAGAATC 
TTTCATAAGA 
TTGGGATATG 
TCTGAATATT 



21 
I 

TTAACTAAGA 
GGAGCAGTCC 
TCGATAGAAT 
TTTTTTTTTG 
TTGCCATTAA 
CTTTCCAGCT 
GGTTTTTTAA 
CTACATTTCA 
CTGATTTACA 



31 
I 

CAAAGTGAAT 
ATCTTTATGG 
TGTATTTGGC 
TAATCATATT 
TTTAAAAGTT 
GTAAGTCATC 
AACAGAAAAA 
AAGGTATCTT 
GAAATTATAA 



Seq ID NO: 33 DNA sequence 

Nucleic Acid Accession #: AK026418.1



TTTTAAGATG 
TCACTGCAAC 
TGGGATTACA 
AGAATTCACC 
TATGGGCGTG 
TTGTTATTAT 
CAGAGGCTGA 
AAGTACAGAC 
TTAAAAGAGT 
AGAGACTAGA 
CTTAAATGAC 
ATGTTAGATT 
TGATTGGCTT 
AGGTATTTGG 
TTTTGTGGAA 
AACAACAGTG 
TCAGCGATTT 
CATTTCACAC 
ACTCAGTTAT 
TGAAGTTGGC 
AGTGTTAGAG 
CGGTCACTTT 
GAGAAGCTCA 
TACATTAAAA 
ATGAGAAATT 
AGCAAACCCC 
TAGGTAACAC 
TGGACACAGT 
AAATACTCTG 
TAGTAAATTC 
TTGATTATTG 
AGAGACTCGT 
ATAAGCTGAT 
AAAAAAAAAA 



11 
I 

GAGTTTTCGC 
CTCTGCCCCC 
GGCACACACC 
ATATTGGCCT 
AGCCAGCCAC 
CCAAGAATTG 
TTAATATAAC 
AAAAGTGGAA 
GGTGTCACAT 
ACTCCAACTG 
TAGTAATATT 
TGTAGCCAAA 
TTTAGAACGT 
GGTTTTTCAG 
AACACTTTGG 
TAATAGAAAT 
TTATAGAAGT 
CCCTAAATTT 
ACTGAATTCA 
AGAGCCCTCT 
ATGGATAATG 
GACTGCAGTA 
TGGTGGGGGA 
AATTAATAAG 
TAGTGAAAGA 
TTTTTTAAGG 
ATGATTGGAG 
AATGCATATA 
TAGCTTTTGT 
TAGGTTTCTT 
GTATCACATT 
AGATATTGAC 
GCAGTCAT CA 



21 
I 

TCTTGTTGCC 
CGGGTTCAGG 
ACCACGGCCA 
CAGGTAATCC 



TTGATAGAGT 
TAGTTTACAT 
AACAAACCAG 
TAAAAGAAAA 
CTAGCCAACT 
CCTACATTAT 
TATGTCTAGG 
TATATATTAG 
ACTTACTTGA 
CAAACTCTGA 
GGAAATTACT 
TGCTTTATGA 
TCTACATGAG 
TTTATGATGA 
CTGGTACCTG 
TTTCTGTGTA 
GAGCTTCTTA 
AACCTGGAAT 
AGTATCTATT 
TTTAAAATCA 
CAATGTCAGT 
AGATTGAAGA 
TTTAAAATGG 
GATTCAGGGA 
GTGTGGAACT 
TATTAGTCTG 
TGGGAGACCC 
TTTCACATTA 



31 
I 

CAGGCTGGAG 
CGATTCTCTC 
GCTAATTATT 
GCCCGCCTCG 
CCGTTTTATT 
ATATACTGTA 
TTGTTAGCCT 

AAGTCACAGA 
GCCTAGAATA 
GTGATGGCAT 
AAATGCTTAA 
TGTGCTTTAT 
TTACAGATCT 
GTCTTAGTCA 
GATTCACATT 
CAAAGAAAGC 
GATTTATTTC 
GCGCTCTCAA 
ATTAGAAGTC 
GCAGAAGTAG 
GTGAGCAGTC 
TTATCTAAAT 
TGGTGAAATC 
TTTTTCAGAC 
TATTAAGCTT 
GTGAAGTCCC 
TTCATGTTAA 
AATGAGTGGA 
CAGTGGGCAA 
TATGTATCTG 
AAGCTGAATG 
AAATGTACCA 



Seq ID NO: 34 DNA sequence 

Nucleic Acid Accession #: CAT cluster 



CTACTACTAA 
GCACTGTTCA 
ATGTGGGCGT 
CAATTTGGTT 
AATTCTAAAT 
TATTAAGTAA 
ACTGACAGCA 
AATAGCAAGA 
ACTATATATA 



11 
I 

ATTCGCGGCC 
GCTCTTTTAG 
GGGTGTTGAC 
TGAAATAGCT 
TTATCTTCAC 
AG CAAATGCT 
TATTCTATGA 
AAAATATGAA 
TACACACACA 



21 

i 

GCGTCGACTT 
GCACTGCAAA 
CTACATCTGA 
ATAATTAAGT 
ATACACCCTA 
GAACTAAATG 
ATGATTACGT 
ATGATGGTAG 
CATGCACACA 



31 

I 

TTTTTTTTTT 
GTTGTCTTGA 
ACAATTTACA 
TATTATCAGA 
ACTGAGAAAA 
CCTCCATGTT 
TAGTCGTTTC 
ACAAAAAAGA 
GAATTGCCTT 



41 
I 

AATTCACTGT 
GAAAACCAGC 
TCAGAAATGA 
ACTAGTTTGA 
CCAGTATCCT 
AGCAAGTAAA 
TGTTGACATT 
TTAAATCTGA 
AAAAAAAAGT 



41 
t 

TGCAGTGGTG 
CTCTCAGTCT 
TTGTATTTTG 
GCCTCCCAAA 
GTTTGAAAAA 
TTTGAAGTGT 
TTCACATCTG 
ATTGTGAAGC 
AATAAGTCAG 
TAGTAAATAT 
TTCCCAAACT 
ACAATATAAA 
GCATATCCAA 
GGAGTATCTC 
TTAAAAATAG 
GAGCCATGAA 
TTTGGTTAAC 
TCTGGTTCTC 
CCATTCTTAT 
CGTCTTCCGT 
TCATTATGTC 
TGTGATGGAG 
ATTTCATTTC 
ATTTTCCTCC 
TTTTTCCACA 
TAGGGAACCA 
TGCTTTAAAG 
GAGTAGGTAT 
GCCTCACAGG 
AATCTTAACT 
TGTCATOGAT 
CTAAAATCTG 
CAGCTATATA 



41 
i 

TTGTCTTATG 
ATTAGGAAAG 
TATGATTCAC 
GAAGTATTTA 
GGGCCACATT 
AACATTTATA 
TTTAAAAATT 
GTTTCAGTTT 
CCCGGATGTA 



51 
I 

GCTGAGTGAA 
TTTTTTTTTT 
AATACGACTG 
TGAGCATTTG 
ACTAGGGGCC 
ACTTCAAACA 
GTCGGATATA 
TAGACC 



51 
I 

GAGCCAAATT 
CTAGAATGGT 
TAAGACTGGG 
TTTCATATGA 
CAATATTTGA 
AAATTTAGTA 
GCCAGCCTCT 
AGGCAAAGAC 
CGACGCG 



51 
i 

CAATCTTGGC 
CTCAAGTAAC 
AGTCGAGAGG 
GTGTTGGGAT 
CAAGTACAGG 
AGAACTGAGG 
TGAAGGAATA 
ACAGAGCTGC 
TATTTTGTTT 
TTTCTAGTTT 
GTTTAATTAG 
ACAGTTTTAA 
GAGGTGAGTG 
AAAACAGTTG 
TTTTTGGGTA 
GAATTTATTT 
TGGCATTTGG 
TCACTTTCTC 
TCATCAAAGC 
CTCATAGGGA 
CCCTTAAATT 
TATACTTTCG 
TTTGATAAAT 
ACGTGACCAA 
TTAGTTGGGA 
CATGCCACTT 
TGTACTCCTG 
ATTTCTATCT 
CACAAGAATC 
GAGTGAATTC 
CTCCTTAAGA 
CTCCATGGAT 
TGCCGCAAAA 



51 
I 

TCTCTAATCT 
AGGTGCTAGA 
CACAATTAAA 
CTAGTCTAGA 
TTCTGCACTC 
TTGTTAAGTT 
ATAGGTTTGA 
CTAACTTCTA 
TAGAAATTAT 



60 
120 
180 
240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 



60 
120 
180 
240 
300 
360 
420 
480 
540 
ATACAGCCAT GTCCAGGCNC GATGGAAATT ATGGGGGAAT ATCCAANTTA GGATACNCGT 600
GCCGAATCGC CGGGTNTAAA TAATACNGGT TTATAATGGA CNATCCACAA TCCTGGTTTA 

Seq ID NO: 35 DNA sequence
Nucleic Acid Accession #: NM_018490.1
Coding sequence: 445..3300

1 11 21 31 41 51 

I I I I I I

CCGCGGCTGG GAGACAGCGA GCCAGAGTCT GGGTGTTTGT GCGAGAGCCA CGGCGGGGGC 60

TGGGGCGAGT GGCCGGCATG GCTGAAGGCT GCGCTCTGCA ACCTTGAAGA GCCGCTGCAT 120 

TGAGAGGCCA GGGACAGGGA GACCGGTGCG ATGGCAGAGC GCGGCCCCCG CCGCTGCGCC 180 

GGGCCGGCCC GGCTGGCCTG AGCCGCCGGA GGAGCGGGGC TGCCTCTGCG CGTCCATGGA 240 

GCAGCGGGAA GGGCGAAACT CCGGAGCGCC GCGTCCCTGC GCCGCTGCGG CGGACTGCTG 300 

AAGGGGCCGA GCCCGCGCGG ACCGCCGAGG AAGAGACCCC CGCTCCAGCC CGCAGGCCGG 360

CTGCCCGGGG GCGGCGGGGG ACATCGGAGG GCAGCGGAGC GAGCAGCGCC GCGGGAGAGG 420 

CCGGCGCGGG AGGCGGCCGC AGCAATGCCG GGCCCGCTAG GGCTGCTCTG CTTCCTCGCC 480 

CTGGGGCTGC TCGGCTCGGC CGGGCCCAGC GGCGCGGCGC CGCCTCTCTG CGCGGCGCCC 540 

TGCAGCTGCG ACGGCGACCG TCGGGTGGAC TGCTCCGGGA AGGGGCTGAC GGCCGTGCCC 600 

GAGGGGCTCA GCGCCTTCAC CCAAGCGCTG GATATCAGTA TGAACAACAT TACTCAGTTG 660

CCAGAAGATG CATTTAAGAA CTTTCCTTTT CTAGAAGAGC TACAATTGGC GGGCAACGAC 720 

CTTTCTTTTA TCCACCCAAA GGCCTTGTCT GGGTTGAAAG AACTCAAAGT TCTAACGCTC 780 

CAGAATAATC AGTTGAAAAC AGTACCCAGT GAAGCCATTC GAGGGCTGAG TGCTTTGCAG 840 

TCTTTGCGTT TAGATGCCAA CCATATTACC TCAGTCCCCG AGGACAGTTT TGAAGGACTT 900

GTTCAGTTAC GGCATCTGTG GCTGGATGAC AACAGCTTGA CGGAGGTGCC TGTGCACCCC 960

CTCAGCAATC TGCCCACCCT ACAGGCGCTG ACCCTGGCTC TCAACAAGAT CTCAAGCATC 1020

CCTGACTTTG CATTTACCAA CCTTTCAAGC CTGGTAGTTC TGCATCTTCA TAACAATAAA 1080 

ATTAGAGGCC TGAGTCAACA CTGTTTTGAT GGACTAGATA ACCTGGAGAC CTTAGACTTG 1140 

AGTTATAATA ACTTGGGGGA ATTTCCTCAG GCTATTAAAG CCCGTCCTAG CCTTAAAGAG 1200 

CTAGGATTTC ATAGTAATTC TATTTCTGTT ATCCCTGATG GAGCATTTGA TGGTAATCCA 1260

CTCTTAAGAA CTATACATTT GTATGATAAT CCTCTGTCTT TTGTGGGGAA CTCAGCATCT 1320 

CACAATTTAT CTGATCTTCA TTCCCTAGTC ATTCGTGGTG CAAGCATGGT GCAGCAGTTC 1380 

CCCAATCTTA CAGGAACTGT CCACCTGGAA AGTCTGACTT TGACAGGTAC AAAGATAAGC 1440

AGCATACCTA ATAATTTGTG TCAAGAACAA AAGATGCTTA GGACTTTGGA CTTGTCTTAC 1500 

AATAATATAA GAGACCTTCC AAGTTTTAAT GGTTGCCATG CTCTGGAAGA AATTTCTTTA 1560

CAGCGTAATC AAATCTACCA AATAAAGGAA GGCACCTTTC AAGGCCTGAT ATCTCTAAGG 1620 

ATTCTAGATC TGAGTAGAAA CCTGATACAT GAAATTCACA GTAGAGCTTT TGCCACACTT 1680 

GGGCCAATAA CTAACCTAGA TGTAAGTTTC AATGAATTAA CTTCCTTTCC TACGGAAGGC 1740 

CCGAATGGGC TAAATCAACT GAAACTTGTG GGCAACTTCA AGCTGAAAGA AGCCTTAGCA 1800 

GCAAAAGACT TTGTTAACCT CAGGTCTTTA TCGGTACCAT ATGCTTATCA GTGCTGTGCA 1860

TTTTGGGGTT GTGACTCTTA TGCAAATTTA AACACAGAAG ATAACAGCCT CCAGGACCAC 1920

AGTGTGGCAC AGGAGAAAGG TACTGCTGAT GCAGCAAATG TCACAAGCAC TCTTGAAAAT 1980 

GAAGAACATA GTCAAATAAT TATCCATTGT ACACCTTCAA CAGGTGCTTT TAAGCCCTGT 2040 

GAATATTTAC TGGGAAGCTG GATGATTCGT CTTACTGTGT GGTTCATTTT CTTGGTTGCA 2100 

TTATTTTTCA ACCTGCTTGT TATTTTAACA ACATTTGCAT CTTGTACATC ACTGCCTTCG 2160

TCCAAATTGT TTATAGGCTT GATTTCTGTG TCTAACTTAT TCATGGGAAT CTATACTGGC 2220

ATCCTAACTT TTCTTGATGC TGTGTCCTGG GGCAGATTCG CTGAATTTGG CATTTGGTGG 2280 

GAAACTGGCA GTGGCTGCAA AGTAGCTGGG TTTCTTGCAG TTTTCTCCTC AGAAAGTGCC 2340 

ATATTTTTAT TAATGCTAGC AACTGTCGAA AGAAGCTTAT CTGCAAAAGA TATAATGAAA 2400 

AATGGGAAGA GCAATCATCT CAAACAGTTC CGGGTTGCTG CCCTTTCGGC TTTCCTAGGT 2460

GCTACAGTAG CAGGCTGTTT TCCCCTTTTC CATAGAGGGG AATATTCTGC ATCACCCCTT 2520 

TGTTTGCCAT TTCCTACAGG TGAAACGCCA TCATTAGGAT TCACTGTAAC GTTAGTGCTA 2580 

TTAAACTCAC TAGCATTTTT ATTAATGGCC GTTATCTACA CTAAGCTATA CTGCAACTTG 2640

GAAAAAGAGG ACCTCTCAGA AAACTCACAA TCTAGCATGA TTAAGCATGT CGCTTGGCTA 2700

ATCTTCACCA ATTGCATCTT TTTCTGCCCT GTGGCGTTTT TTTCATTTGC ACCATTGATC 2760

ACTGCAATCT CTATCAGCCC CGAAATAATG AAGTCTGTTA CTCTGATATT TTTTCCATTG 2820

CCTGCTTGCC TGAATCCAGT CCTGTATGTT TTCTTCAACC CAAAGTTTAA AGAAGACTGG 2880 

AAGTTACTGA AGCGACGTGT TACCAAGAAA AGTGGATCAG TTTCAGTTTC CATCAGTAGC 2940 

CAAGGTGGTT GTCTGGAACA GGATTTCTAC TACGACTGTG GCATGTACTC ACATTTGCAG 3000 

GGCAACCTGA CTGTTTGCGA CTGCTGCGAA TCGTTTCTTT TAACAAAGCC AGTATCATGC 3060

AAACACTTGA TAAAATCACA CAGCTGTCCT GCATTGGCAG TGGCTTCTTG CCAAAGACCT 3120 

GAGGGCTACT GGTCCGACTG TGGCACACAG TCGGCCCACT CTGATTATGC AGATGAAGAA 3180 

GATTCCTTTG TCTCAGACAG TTCTGACCAG GTGCAGGCCT GTGGACGAGC CTGCTTCTAC 3240

CAGAGTAGAG GATTCCCTTT GGTGCGCTAT GCTTACAATC TACCAAGAGT TAAAGACTGA 3300 

ACTACTGTGT GTGTAACCGT TTCCCCCGTC AACCAAAATC AGTGTTTATA GAGTGAACCC 3360

TATTCTCATC TTTCATCTGG GAAGCACTTC TGTAATCACT GCCTGGTGTC ACTTAGAAGA 3420 

AGGAGAGGTG GCAGTTTATT TCTCAAACCA GTCATTTTCA AAGAACAGGT GCCTAAATTA 3480 

TAAATTGGTG AAAAATGCAA TGTCCAAGCA ATGTATGATC TGTTTGAAAC AAATATATGA 3540 

CTTGAAAAGG ATCTTAGGTG TAGTAGAGCA ATATAATGTT AGTTTTTTCT GATCCATAAG 3600

AAGCAAATTT ATACCTATTT GTGTATTAAG CACAAGATAA AGAACAGCTG TTAATATTTT 3660

TTAAAAATCT ATTTTAAAAT GTGATTTTCT ATAACTGAAG AAAATATCTT GCTAATTTTA 3720 

CCTAATGTTT CATCCTTAAT CTCAGGACAA CTTACTGCAG GGCCAAAAAA GGGACTGTCC 3780 

CAGCTAGAAC TGTGAGAGTA TACATAGGCA TTACTTTATT ATGTTTTCAC TTGCCATCCT 3840 

TGACATAAGA GAACTATAAA TTTTGTTTAA GCAATTTATA AATCTAAAAC CTGAAGATGT 3900 

TTTTAAAACA ATATTAACAG CTGTTAGGTT AAAAAAATAG CTGGACATTT GTTTTCAGTC 3960

ATTATACATT GCTTTGGTCC AATCAGTAAT TTTTTCTTAA GTGTTTTGTG ATTACACTAC 4020 

TAGAAAAAAA GTAAAAGGCT AATTGCTGTG TGGGTTTAGT CGATTTGGCT AAACTACTAA 4080 

CTAATGTGGG GGTTTAATAG TATCTGAGGG ATTTGGTGGC TTCATGTAAT GTTCTCATTA 4140

ATGAATACTT CCTAATATCG TTGGCTCTAC TAATATTTTC CAATTTGCTG GGATGTCACC 4200 

TAGCAATAGC TTGGATTATA TAGAAAGTAA ACTGTGGTCA ATACTTGCAT TTAATTAGAC 4260

GAAACGGGGA GTAATTATGA CACGAAGTAC TTATGTTTAT TTCTTAGTGA GCTGGATTAT 4320

CTTGAACCTG TGCTATTAAA TGGAAATTTC CATACATCTT CCCCATACTA TTTTTTATAA 4380 

AAGAGCCTAT TCAATAGCTC AGAGGTTGAA CTCTGGTTAA ACAAGATAAT ATGTTATTAA 4440 

TAAAAATAGA AGAAGAAAGA ATAAAGCTTA GTCCTGTGTC TTTAAAAATT AAAAATTTTA 4500

CTTGATTCCC ATCTATGGGC TTTAGACCTA TTACTGGGTG GAGTCTTAAA GTTATAATTG 4560 

TTCAATATGT TTTTTGAACA GTGTGCTAAA TCAATAGCAA ACCCACTGCC ATATTAGTTA 4620 

TTCTGAATAT ACTAAAAAAA TCCAGCTAGA TTGCAGTTTA ATAATTAAAC TGTACATACT 4680 

GTGCATATAA TGAATTTTTA TCTTATGTAA ATTATTTTTA GAACACAAGT TGGGAAATGT 4740

GGCTTCTGTT CATTTCGTTT AATTAAAGCT ACCTCCTAAA CTATAGTGGC TGCCAGTAGC 4800 

AGACTGTTAA ATTGTGGTTT ATATACTTTT TGCATTGTAA ATAGTCTTTG TTGTACATTG 4860 

TCAGTGTAAT AAAAACAGAA TCTTTGTATA TCAAAATCAT GTAGTTTGTA TAAAATGTGG 4920 

GAAGGATTTA TTTACAGTGT GTTGTAATTT TGTAAGGCCA ACTATTTACA AGTTTTAAAA 4980 

ATTGCTATCA TGTATATTTA CACATCTGAT AAATATTAAA TCATAACTTG GTAAGAAACT 5040

CCTAATTAAA AGGTTTTTTC CAAAATTCAG GTTATTGAAA ATTTTTCATT TTATTCATTT 5100 

AAAAACTAGA ATAACAGATA TATAAAAGTG TTAATCTTTG TGCTATATGG TATGAAATAC 5160 
AATATTGTAC TCAGTGTTTT GAATTATTAA AGTTTCTAGA AAGCAAAAAA A 

Seq ID NO: 36 Protein sequence

Protein Accession #: NP_060960.1 

1 11 21 31 41 51 

I I I I I I

MPGPLGLLCF LALGLLGSAG PSGAAPPLCA APCSCDGDRR VDCSGKGLTA VPEGLSAFTQ 60

ALDISMNNIT QLPEDAFKNF PFLEELQLAG NDLSFIHPKA LSGLKELKVL TLQNNQLKTV 120

PSEAIRGLSA LQSLRLDANH ITSVPEDSFE GLVQLRHLWL DDNSLTEVPV HPLSNLPTLQ 180 

ALTLALNKIS SIPDFAFTNL SSLVVLHLHN NKIRGLSQHC FDGLDNLETL DLSYNNLGEF 240

PQAIKARPSL KELGFHSNSI SVIPDGAFDG NPLLRTIHLY DNPLSFVGNS ASHNLSDLHS 300

LVIRGASMVQ QFPNLTGTVH LESLTLTGTK ISSIPNNLCQ EQKMLRTLDL SYNNIRDLPS 360

FNGCHALEEI SLQRNQIYQI KEGTFQGLIS LRILDLSRNL IHEIHSRAFA TLGPITNLDV 420

SFNELTSFPT EGPNGLNQLK LVGNFKLKEA LAAKDFVNLR SLSVPYAYQC CAFWGCDSYA 480

NLNTEDNSLQ DHSVAQEKGT ADAANVTSTL ENEEHSQIII HCTPSTGAFK PCEYLLGSWM 540 

IRLTVWFIFL VALFFNLLVI LTTFASCTSL PSSKLFIGLI SVSNLFMGIY TGILTFLDAV 600

SWGRFAEFGI WWETGSGCKV AGFLAVFSSE SAIFLLMLAT VERSLSAKDI MKNGKSNHLK 660

QFRVAALSAF LGATVAGCFP LFHRGEYSAS PLCLPFPTGE TPSLGFTVTL VLLNSLAFLL 720 

MAVIYTKLYC NLEKEDLSEN SQSSMIKHVA WLIFTNCIFF CPVAFFSFAP LITAISISPE 780

IMKSVTLIFF PLPACLNPVL YVFFNPKFKE DWKLLKRRVT KKSGSVSVSI SSQGGCLEQD 840 

FYYDCGMYSH LQGNLTVCDC CESFLLTKPV SCKHLIKSHS CPALAVASCQ RPEGYWSDCG 900

TQSAHSDYAD EEDSFVSDSS DQVQACGRAC FYQSRGFPLV RYAYNLPRVK D

Seq ID NO: 37 DNA sequence 

Nucleic Acid Accession #: AF144648.1 

Coding sequence: 1..1884

1 11 21 31 41 51 

I I I I I I

ATGCTGCGAG CCGCAGTGAT CCTGCTGCTC ATCAGGACCT GGCTCGCGGA GGGCAACTAC 60 

CCCAGTCCCA TCCCGAAATT CCACTTCGAG TTCTCCTCTG CTGTGCCCGA AGTCGTCCTG 120

AACCTCTTCA ACTGCAAAAA TTGTGCAAAT GAAGCTGTGG TTCAAAAGAT TTTGGACAGG 180

GTGCTGTCAA GATACGATGT CCGCCTGAGA CCGAATTTTG GAGGTGCCCC TGTGCCTGTG 240 

AGAATATCTA TTTATGTCAC GAGCATTGAA CAGATCTCAG AAATGAATAT GGACTACACG 300 

ATCACGATGT TTTTTCATCA GACTTGGAAA GATTCACGCT TAGCATACTA TGAGACCACC 360 

CTGAACTTGA CCCTGGACTA TCGGATGCAT GAGAAGTTGT GGGTCCCTGA CTGCTACTTT 420 

TTGAACAGCA AGGATGCTTT CGTGCATGAT GTGACTGTGG AGAATCGCGT GTTTCAGCTT 480

CACCCAGATG GAACGGTGCG GTACGGCATC CGACTCACCA CTACAGCAGC TTGTTCCCTG 540 

GATCTGCATA AATTCCCTAT GGACAAGCAG GCCTGCAACC TGGTGGTAGA GAGCTATGGT 600 

TACACGGTTG AAGACATCAT ATTATTCTGG GATGACAATG GGAACGCCAT CCACATGACT 660 

GAGGAGCTGC ATATCCCTCA GTTCACTTTC CTGGGAAGGA CGATTACTAG CAAGGAGGTG 720 

TATTTCTACA CAGGTTCCTA CATACGCCTG ATACTGAAGT TCCAGGTTCA GAGGGAAGTT 780

AACAGCTACC TTGTGCAAGT CTACTGGCCT ACTGTCCTCA CCACTATTAC CTCTTGGATA 840 

TCGTTTTGGA TGAACTATGA TTCCTCTGCA GCCAGGGTGA CAATTGGCTT AACTTCAATG 900 

CTCATCCTGA CCACCATCGA CTCACATCTG CGGGATAAGC TCCCCAACAT TTCCTGTATC 960 

AAGGCCATTG ATATCTATAT CCTCGTGTGC TTGTTCTTTG TGTTCCTGTC CTTGCTGGAG 1020 

TATGTCTACA TCAACTATCT TTTCTACAGT CGAGGACCTC GGCGCCAGCC TAGGCGACAC 1080

AGGAGACCCC GAAGAGTCAT TGCCCGCTAC CGCTACCAGC AAGTGGTGGT AGGAAACGTG 1140 

CAGGATGGCC TGATTAACGT GGAAGACGGA GTCAGCTCTC TCCCCATCAC CCCAGCGCAG 1200 

GCCCCCCTGG CAAGCCCGGA AAGCCTCGGT TCTTTGACGT CCACCTCCGA GCAGGCCCAG 1260 

CTGGCCACCT CGGAAAGCCT CAGCCCACTC ACTTCTCTCT CAGGCCAGGC CCCCCTGGCC 1320 

ACTGGAGAAA GCCTGAGCGA TCTCCCCTCC ACCTCAGAGC AGGCCCGGCA CAGCTATGGT 1380

GTTCGCTTTA ATGGTTTCCA GGCTGATGAC AGTATTTTTC CTACCGAAAT CCGCAACCGT 1440 

GTCGAAGCCC ATGGCCATGG TGTTACCCAT GACCATGAAG ATTCCAATGA GAGCTTGAGC 1500 

TCGGATGAGC GCCATGGCCA TGGCCCCAGT GGGAAGCCCA TGCTTCACCA TGGCGAGAAG 1560 

GGTGTGCAAG AAGCAGGCTG GGACCTTGAT GACAACAATG ACAAGAGCGA CTGCCTTGCC 1620 

ATTAAGGAGC AATTCAAGTG TGATACTAAC AGTACCTGGG GCCTTAATGA TGATGAGCTC 1680

ATGGCCCATG GCCAAGAGAA GGACAGTAGC TCAGAGTCTG AGGATAGTTG CCCCCCAAGC 1740 

CCTGGGTGCT CCTTCACTGA AGGGTTCTCC TTCGATCTCT TTAATCCTGA CTACGTCCCA 1800 

AAGGTCGACA AGTGGTCCCG GTTCCTCTTC CCTCTGGCCT TTGGGTTGTT CAACATTGTT 1860 
TACTGGGTAT ACCATATGTA TTAG 






Seq ID NO: 38 Protein sequence 
Protein Accession #: AAD51172.1 



1 11 21 31 41 51

| | | | | |

MLRAAVILLL IRTWLAEGNY PSPIPKFHFE FSSAVPEVVL NLFNCKNCAN EAVVQKILDR 60
VLSRYDVRLR PNFGGAPVPV RISIYVTSIE QISEMNMDYT ITMFFHQTWK DSRLAYYETT 120
LNLTLDYRMH EKLWVPDCYF LNSKDAFVHD VTVENRVFQL HPDGTVRYGI RLTTTAACSL 180
DLHKFPMDKQ ACNLVVESYG YTVEDIILFW DDNGNAIHMT EELHIPQFTF LGRTITSKEV 240
YFYTGSYIRL ILKFQVQREV NSYLVQVYWP TVLTTITSWI SFWMNYDSSA ARVTIGLTSM 300
LILTTIDSHL RDKLPNISCI KAIDIYILVC LFFVFLSLLE YVYINYLFYS RGPRRQPRRH 360
RRPRRVIARY RYQQVVVGNV QDGLINVEDG VSSLPITPAQ APLASPESLG SLTSTSEQAQ 420
LATSESLSPL TSLSGQAPLA TGESLSDLPS TSEQARHSYG VRFNGFQADD SIFPTEIRNR 480
VEAHGHGVTH DHEDSNESLS SDERHGHGPS GKPMLHHGEK GVQEAGWDLD DNNDKSDCLA 540
IKEQFKCDTN STWGLNDDEL MAHGQEKDSS SESEDSCPPS PGCSFTEGFS FDLFNPDYVP 600
KVDKWSRFLF PLAFGLFNIV YWVYHMY

Seq ID NO: 39 DNA sequence

Nucleic Acid Accession #: U47334.1 
Coding sequence: 1..331 


1 11 21 31 41 51 

15 | | | | | | 

CAAAAATTGT GCAAATGAAG CTGTGGTTCA AAAGATTTTG GACAGGGTGC TGTCAAGATA 60 
CGATGTCCGC CTGAGACCGA ATTTTGGANN NATGCTTGCT ACTAACAGTA CCCGGGGCCT 120 
TAATGAAGAT GAGCTCATGG CCCATGGCCA AGAGAAGGAC AGTAGCTCAG AGTCTGAGGA 180 
TAGTTGCCCC CCAAGCCCTG GGTGCTCCTT CACTGAAGGG TTCTCCTTCG ATCTCCTTAA 240 
TCCTGACTAC GTCCCAAAGG TCGACAAGTG GTCCCGGTTC CTCTTCCCTC TGGCCTTTGG 300

GTTGTTCAAC ATTGTAGCGG CCGAACGATG C 






Seq ID NO: 40 Protein sequence 
Protein Accession #: AAC50559.1 

1 11 21 31 41 51 

I I I I I I 

KNCANEAVVQ KILDRVLSRY DVRLRPNFGX MLATNSTRGL NEDELMAHGQ EKDSSSESED 60

SCPPSPGCSF TEGFSFDLLN PDYVPKVDKW SRFLFPLAFG LFNIVAAERC 

Seq ID NO: 41 DNA sequence 

Nucleic Acid Accession #: NM_020974 

Coding sequence: 81..3080



1 11 21 31 41 51 

I I I I I I 

GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CCGCACTCCG CCGCCTCTGC 60 

CCGCAACCGC TGAGCCATCC ATGGGGGTCG CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 120 

CGGTGCTGCT GCTGCTGCTG CTGCTGCCGC CACTGCTGCT GCTGGCGGGG GCCGTCCCGC 180 

CGGGTCGGGG CCGTGCCGCG GGGCCGCAGG AGGATGTAGA TGAGTGTGCC CAAGGGCTAG 240

ATGACTGCCA TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCTACAAG TGCTCCTGCA 300 

AGCCTGGCTA CCAAGGGGAA GGCAGGCAGT GTGAGGACAT CGATGAATGT GGAAATGAGC 360 

TCAATGGAGG CTGTGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420 

TTGATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATGTGGAC GAGTGCCTGG 480 

AGAACAATGG CGGCTGCCAG CATACCTGTG TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540

GCAAGGAGGG GTTTTTCCTG AGTGACAATC AGCACACCTG CATTCACCGC TCGGAAGAGG 600 

GCCTGAGCTG CATGAATAAG GATCACGGCT GTAGTCACAT CTGCAAGGAG GCCCCAAGGG 660 

GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720 

TCTTGACCTG TAACCATGGG AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780 

GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATGCACAC AGATGGGAGG AGCTGCCTTG 840

AGCGAGAGGA CACTGTCCTG GAGGTGACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900 

ATAAACGGGT GAAACGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960 

ACCGCACCTG TAAGGATACT TCGACAGGTG TCCACTGCAG TTGTCCTGTT GGATTCACTC 1020 

TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATGAGTG CCAGACCCGC AATGGAGGTT 1080 

GTGATCATTT CTGCAAAAAC ATCGTGGGCA GTTTTGACTG CGGCTGCAAG AAAGGATTTA 1140

AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGGATGA GTGCTCTTTG GATAGGACCT 1200 

GTGACCACAG CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC CGAGGGTACA 1260 

CCCTGTATGG CTTCACCCAC TGTGGAGACA CCAATGAGTG CAGCATCAAC AACGGAGGCT 1320 

GTCAGCAGGT CTGTGTGAAC ACAGTGGGCA GCTATGAATG CCAGTGCCAC CCTGGGTACA 1380 

AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTGAAGGG GCTCCTGCCC ACAAGTGTGT 1440

CACCCCGTGT GTCCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAGAT 1500 

GTCACTCTGG CATTCACCTC TCTTCAGATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560 

AGCTAAATGA AGGCAAGTGT AGTTTGAAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 

CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT CCGCTACGTA AACCTTACAT 1680 

GCAGCTCTGG CAAGCAAGTC CCAGGAGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740

TTATCACTGT TGAGTTTGAG CTTGAAACTA ACCAAAAGGA GGTGACAGCT TCTTGTGACC 1800 

TGAGCTGCAT CGTAAAGCGA ACCGAGAAGC GGCTCCGTAA AGCCATCCGC ACGCTCAGAA 1860 

AGGCCGTCCA CAGGGAGCAG TTTCACCTCC AGCTCTCAGG CATGAACCTC GACGTGGCTA 1920 

AAAAGCCTCC CAGAACATCT GAACGCCAGG CAGAGTCCTG TGGAGTGGGC CAGGGTCATG 1980 

CAGAAAACCA ATGTGTCAGT TGCAGGGCTG GGACCTATTA TGATGGAGCA CGAGAACGCT 2040

GCATTTTATG TCCAAATGGA ACCTTCCAAA ATGAGGAAGG ACAAATGACT TGTGAACCAT 2100 

GCCCAAGACC AGGAAATTCT GGGGCCCTGA AGACCCCAGA AGCTTGGAAT ATGTCTGAAT 2160 

GTGGAGGTCT GTGTCAACCT GGTGAATATT CTGCAGATGG CTTTGCACCT TGCCAGCTCT 2220 

GTGCCCTGGG CACGTTCCAG CCTGAAGCTG GTCGAACTTC CTGCTTCCCC TGTGGAGGAG 2280 

GCCTTGCCAC CAAACATCAG GGAGCTACTT CCTTTCAGGA CTGTGAAACC AGAGTTCAAT 2340

GTTCACCTGG ACATTTCTAC AACACCACCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400 

CATACCAGCC TGAATTTGGA AAAAATAATT GTGTTTCTTG CCCAGGAAAT ACTACGACTG 2460 

ACTTTGATGG CTCCACAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGG 2520 

GAGATTTCAC TGGGTACATT GAATCCCCAA ACTACCCAGG CAATTACCCA GCCAACACCG 2580 

AGTGTACGTG GACCATCAAC CCACCCCCCA AGCGCCGCAT CCTGATCGTG GTCCCTGAGA 2640

TCTTCCTGCC CATAGAGGAC GACTGTGGGG ACTATCTGGT GATGCGGAAA ACCTCTTCAT 2700 

CCAATTCTGT GACAACATAT GAAACCTGCC AGACCTACGA ACGCCCCATC GCCTTCACCT 2760

CCAGGTCAAA GAAGCTGTGG ATTCAGTTCA AGTCCAATGA AGGGAACAGC GCTAGAGGGT 2820
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA ACTCATTGAA GACATAGTTC 2880
GAGATGGCAG GCTCTATGCA TCTGAGAACC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940
TCAAGGCTCT GTTTGATGTC CTGGCCCATC CCCAGAACTA TTTCAAGTAC ACAGCCCAGG 3000
AGTCCCGAGA GATGTTTCCA AGATCGTTCA TCCGATTGCT ACGTTCCAAA GTGTCCAGGT 3060
TTTTGAGACC TTACAAATGA CTCAGCCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120
GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180
CCGTATCAGT GACTCATTAG AGTTCAATTT TTATAGATAA TACAGATATT TTGGTAAATT 3240
GAACTTGGTT TTTCTTTCCC AGCATCGTGG ATGTAGACTG AGAATGGCTT TGAGTGGCAT 3300
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360
TTGGTCAGCC TAGGTGAGAC TCACCTGTCC TTCTGGGGTC TTACTCCTCC TCAAGGAGTC 3420
TGTAGTGGAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG AAACTTCAGC TTCCTCTAGC 3480
CCGGCCCTCT CTAAGGGAGC CCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540
CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGAG ACCTGGGAGG 3600
ACTCAGTTTC TCCACAGCCT TCTCCAGCCT GTGTGATACA AGTTTGATCC CAGGAACTTG 3660
AGTTCTAAGC AGTGCTCGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720
AGCACTTCTG GAGACAT

Seq ID NO: 42 Protein sequence
Protein Accession #: NP_066025

1 11 21 31 41 51
| | | | | |
MGVAGRNRPG AAWAVLLLLL LLPPLLLLAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 60
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120
HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTCIHR SEEGLSCMNK 180
DHGCSHICKE APRGSVACEC RPGFELAKNQ RDCILTCNHG NGGCQHSCDD TADGPECSCH 240
PQYKMHTDGR SCLEREDTVL EVTESNTTSV VDGDKRVKRR LLMETCAVNN GGCDRTCKDT 300
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 360
SCQDVDECSL DRTCDHSCIN HPGTFACACN RGYTLYGFTH CGDTNECSIN NGGCQQVCVN 420
TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLRCHSGIHL 480
SSDVTTIRTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV NLTCSSGKQV 540
PGAPGRPSTP KEMFITVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 600
FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA RERCILCPNG 660
TFQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYSADGFAP CQLCALGTFQ 720
PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRCIRC PVGTYQPEFG 780
KNNCVSCPGN TTTDFDGSTN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 840
PPPKRRILIV VPEIFLPIED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900
IQFKSNEGNS ARGFQVPYVT YDEDYQELIE DIVRDGRLYA SENHQEILKD KKLIKALFDV 960
LAHPQNYFKY TAQESREMFP RSFIRLLRSK VSRFLRPYK

Seq ID NO: 43 DNA sequence
Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51
| | | | | |
TTTCTTCATT TTATGCTTTT CTCCCCTTTA TATATACTGG GCGGTTTTTC CTTGAGAAAT 60
TTTCCATCTC ATTAATTCTC CTGCAGCAAT TCATAACTCT TTGGGGGCAT TCCTTTGTTT 120
TTTGATATGA CTACTACCTG ACTGTATATA GTTTCCCTTT TTTTTTTTTC CTCCCAGATT 180
[illegible] TACTGGCATC CTTTTCCATT TTACTCAATT TTCCTCAGTT AGGTTGACTT 240
GCTTTTATAC CTGTGTGATG CTCCTTGCCA GATATCTAGC AAATGCCCCC AGGATCCAAT 300
CATTTTTTTC CTAAGAAAAC TGAAAAGAAG CATGGCAAAT AACAGAGCTT GGAAAATAGG 360
AAACTTTAAA ATACAAAGCC CAGTGAAATC TACTTGGAAG CCAATGCTTA GAGGCAAGAG 420
ACAGTGATTC AAATAGGTGT TGANNNNNNN NNNNNNNNNN NATGATCAGC ATAGCAAAGA 480
TCACTTTCCA ACATTGGAAA GTTATGCATA TTCCAATTGA GCTAGCCCTT TTAAACAGCC 540
TTAAAATTGT ATAAAAGAGA AGAAATTTAA GATATTGAAA ACTGGTAGAT AATAAAACCT 600
AAATAAAGCT GGTTTTGGAA GAGCAGTGGC CACTGTGATT GACAATGGGG GCACTTACTG 660
TTAAGGGGAT TTATAACAGA AGTACTTGAA CAGAATTGTG AAGAGAATAG AATTGTGCAT 720
TCTTTTATCT GCCCAGAACC ACAGCTCCCA TGGGAAATAC TCCACCTCAT TCTACAACCT 780
TCTGGCTGCA ACAAAAGCAG TCAAATTAAA ACATAACCCA AAGGGGGTAC CTAACCCAAC 840
TTGAGAAAAT CATAGCATNC TCCCTTTGGC TATAACTNTT TCCACATGAA ATACATTCAA 900
ATGCCTT

Seq ID NO: 44 DNA sequence
Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51
| | | | | |
TTTTTTTTTT TTTTTTTGGA TTTTAGTATG CCTTGCAATT TTTTCCCTTT ATTCTGATGC 60
ATGAAGTACC CACTAAAAGT GACTGCTGTT AGTATAGCTT CAGTAATGAG GTGATGAGGT 120
GACAGGGCAG GTGATGCTCT CTTAGTCTCT TTAGGCTACT ATTACAAAAT ACTTCAGACT 180
GAGTAATTCA TAAACAACAG AGATTATTGT TCACAGATCT GGAGGCTGGA AAGTACAAGA 240
CTAAAGGGCC AGAATATTTG GTGTTTGGTG AAGGTCAAAC ATTCAGACAC TCTCAACGAC 300
TATAGCGACA GCAGCAGTCT TCAGGAATCC TATGTGAGGG ACAAACACTC AGAAGCCAGC 360
TGGAGTGTTC TAGAATCCTA TGTGAGGGAC AAACATTCAG ACCCCAGCAG TAGTGTTGTG 420
GAATCCTATG TGAGGGACAA ACTTTCAAAC CCTTGTAGCA GTGTTCTGGA ATCCTATGTG 480
AGGGACAAAA ATTCAGAACC TTGTAGCAGT GTTCTGGAAT CCTATGTGAG GAACAATCA

Seq ID NO: 45 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 31..1092

1 11 21 31 41 51
| | | | | |



GCCAAGCACG AAGGGTTCCG CGCGCCTTCC TCAGAGCCTG CGCACCCTGT TCGACATCCT 60 

GACGACCGGC GGCGCGGCTG CGTGCACCTG CGCACCTCCT TCTTGAAGCG CCGACGGCGG 120 

GCCCCCCGGG ACCCCACGCG CGCCCCGGCC CGGCCCGGGG ATCAGCCGCC GCCGCCGCCG 180 

CAAGCGCCTT GGTGTTCGCT CCGGCCGACG AGCCGCGGAC GGTCCTGGAG AGGAAGCCCC 240

TGCCCCTGGG CGTGCGCGCC CCTCTGGCCG GTCCCAGCGC CGCCGCCCGC AGCCCGGAGC 300

AGCTGTGCGC CCCGGCTGAG GCGGCGCCCT GCCCCGCGGA GCCCGAGCGG TCCCAGAGCG 360 

CGGCGCTGGA ACCGAGCTCC AGCGCGGACG CAGGGACGGG ACCGGGGAGC GGCTCTTCGT 420 

GGACTCCGGA TCCTGGAGGG CGCCTCGGCG GGTCTGGAGG GAAAGGGCTT CATGCTTGTC 480 

GCGGTGCAGT GGCGTGCAGG GCCCTGGAGC GGACTCAGGG GATGCCCGGC GGGCTCCCCG 540 

TGCCCGAGGG GAACGTCGGA GGCACACCAT CGCCAGCGGC GTGGACTGCG GCCTGCTGAA 600

GCAGATGAAG GAGCTGGAGC AGGAGAAGGA GGTGCTGCTG CAGGGTTTGG AGATGATGGC 660 

GCGGGGCCGC GACTGGTACC AGCAGCAGCT GCAACGAGTG CAGGAGCGCC AGCGCCGCCT 720 

GGGCCAGAGC AGAGCCAGCG CCGACTTTGG GGCTGCAGGG AGCCCCCGCC CACTGGGGCG 780 

GCTACTGCCC AAGGTACAAG AGGTGGCCCG GTGCCTGGGG GAGCTGCTGG CTGCAGCCTG 840 

TGCCAGCCGG GCCCTGCCCC CGTCCTCCTC CGGGCCCCCC TGCCCTGCCC TGACGTCCAC 900

CTCACCCCCG GTCTGGCAGC AGGAGACCAT CCTCATGCTG AAGGAGCAGA ACCGACTCCT 960 

CACCCAGGAG GTGACCGAGA AGAGTGAGCG CATCACGCAG CTGGAGCAGG AGAAGTCGGC 1020 

GCTCATTAAG CAGCTGTTTG AGGCCCGCGC CCTGAGCCAG CAGGACGGGG GACCTCTGGA 1080 

TTCCACCTTC ATCTAGTCCT TGTGGGCCGC GTGGGCCCCC AGGGCCAGCC TGGCACTCAG 1140 

CCCTTCGAGG GTGGGCGCCC CATCGCACCC ACCCTCTCTG GCTGGAGACC CCCGGCAGGC 1200

CCAGGCACAG TCCCGGAGTG GGCGCCTTCC TGCCGCCCTT GCCAGATGGG CTCCCCAGGC 1260 

CTGCCCCCGG CTGGTCCCCG CACCGAGCGC TTGACTCCGT TTTGGCTCCT GGTTGYTGAC 1320 

ATGGGCTGGG GGCTCTCTTG AGTCCGCATA GTCCGCAGCT ACTACTGGCC GCTGTCAGTG 1380 

GACAGTGGGG TACCCCTCCA TGAGTTAGCG TCCCCCCGTT TCCAGCGGTG CCGCCCTGGG 1440 

TCCCATCTTC AGGGAAAGGC ACTGCCCACG CCAGGCTGCA CTTCCAACAA CGGGCAGCAG 1500

AGGGCGCGGG GCGGCTCCGA CGCGGGTCCA AGGGCAGCTT CCCGCTCAAC CAGGGCACCA 1560 

GGACGAGGTG GCTGTAGCTC GGACGGACGG AAGTAGATGG AGGGGGTGGG GACGGCCTGT 1620 

AAGCGGGGGG TGCCTGCCTG GCTGGGGAGC CCCAGGGATA GCGGTCGGAC TTCAGGTTCT 1680 

GGCCAAGGCT GAGGGACCCT GGCTGCAGCG GATCGGCACG CCGGGTGGGC GAGAGCTTGG 1740 

CCTGCATGTG CCTCCCACAG ACCCTGGGGT GATGGCCTTC CCCCTCTTGG CCGGGACGTT 1800

GCCCCACGTT GAGTCCCACA CAACATCCTG TGAGCCTGGC TCCCCAGGAG GGCCCCCAGA 1860 

CAGCTCCCAG GCACGTCATA GGCAAAGCCT GTTTCCCCCG ACTCAGGATT TCCAAGGCCT 1920 

GGGGTCCTGC TCACCCCCCT TTGCTCTCAC GCCCAGCCTG TCCCCAGGTT TCAGCTGGGA 1980 

GAGGCCACCT CCCTCAGCCA AGGAAAACGA GAACCCCCAG GGTACAGGAG GAGGCTGGGG 2040 

CAGGTCCCCT TGGGTGTCAC TCCCTCAGCC CCTGCCCAGG CCCACTCCCG CTGGTGCTGG 2100

AGTACGCACT GGTGGGGGGG CCCTGCTCAG CCCAACCTGG AGGGTCCCAG TGTCACCAGA 2160 

ACCAGGGGCA CGGCAACAGC ATCGATGGGT TCTGCAGCCC AGGGCCCCCG ATGCGGGGTC 2220 

AGTGTGTGTG GGGCGCAGGG CCTCCGATGC GGGGTCAGTG CGTGGGGGGC GCAGGGCCCC 2280 

CGATGCGGGG TCAGTGCGTG GGGGGCGCAG GGCCCCCTCG TGTCCAGGGC ACTTTGGTAC 2340 

ACTGTCCCAC AAGGCACCTG TCTCAGAGGA GGGGCCCTGG CAGGCAGCGT GGCAACTCCT 2400

TCCGGAGCCC AGCTCCATGC TAACCTGCCC ACAGCAACCC CACAGAGCCA CATTCCCTGC 2460 

TGCACCTGGT CTGCAGGGTG TCCCAGGACA GGCCCAAGTC AGCCCAGCAT GCAGCTGCCC 2520 

TCCTACCCTG AAGATGGGAG TGGGCTTTCC AGGGGACATA AGGATGTCAG GCCTGGACCT 2580 

CCTGGGCAGG AAAGGGTGCA GGTCCTGAGG GCCTGTGCCC CACAGCCCCA GCACCCAGGT 2640 

GGACTGCAGC GCAGTGGGTG GGCCAGTGGC AGCCAGGGAG AAGCCCCCCG TCAGCAGGCT 2700

GGGGTCTGCC CACCAGGGCC TCCCCACGTC TGCCTTTGAG GGTGCCTGCC ATGCCCTGGG 2760 

GGATCCTGGC ATCTTTACTG GACTGGAAGC AGGAGACAGA ACAGTGTCTG TCCCGGGGTG 2820 

ACTTCATCAG GAGACCGCCC ACATAGAGCT GGACCCCGCA GCTGAAGCGG AAATGTGAGA 2880 

CAGGCTGGCA CCTCCGGAAA AACTGCCTTT CAGCCTTGGT GTTCCGTGCA AGGTGAAAAG 2940 

AAATAGGTCC TCCCAGTTTA CAGCTTGAAA TCAGGCTAGT GAGTGGCCCT GGAGACCACG 3000

AGGGGAGAAT TTAAAGGCCC CGGCTGGCAG GGTCTAGGTG GCTGGCAGAG GCACATGCAG 3060 

ACCCTGCCTG GAGCCTGCCC TAGGACGCTG GGCGGGTCAG TCTCCGTGCA GGATGTGAGC 3120 

AGCGTCCCTG GGCTCTATCC GCGAGGTGCC AGTAGCGTGT GCAGGTACAT ACACGTGCGT 3180 

GCACACTGTG ATGACACCCG GAAATGTCTC AGGATGTTGA AATGTGTCCT TGGGGGCAGA 3240 

AGTGTCCCCA GTTGAGAATC TGCCCCAGAG GAACACACCC ACACCAGGCC TCAGGATTTT 3300

GTGTTGATCA AGTTCCAAGG AAAAGGAACA TCTCAGCCGG GCGTGGTGGT TCACGCCTGG 3360 

AATCCCAGCA CTTGAGGCCA GGAGTTCCAG AGCAGCCTGG GCAACGCAGT GAGAGACCCC 3420 

ATCTCTACAA RAAAAAAAAA AGAAAGAAAG AAAATGAGAG ATCCAGGTTT AAAAATTCAT 3480

AAACACCACA AGGAAACAAT ACACTATGAG ACCCAGCAGA AGCAACAGAT TGACTCTAGA 3540 

CCCAGATACT AGAATTATCA GAGAGAATAT AAAGTAACAG TGTTTTATAT ATCTAAAGAA 3600
ATAAAAGAGA TTTCTGGAAA CATGAAAAAA AA 






Seq ID NO: 46 Protein sequence
Protein Accession #: Eos sequence 



1 11 21 31 41 51

I I I I I I 

MKVESRGPPS CWLRARASNS CLMSADFSCS SCVMRSLFSV TSWVRSRFCS FSMRMVCCCQ 60 
TGGEVDVRAG QGGPEEDGGR ARLAQAAASS SPRHRATSCT LGSSRFSGRG LPAAPKSALA 120 
LLWPRRRWRS CTRCSCCWYQ SRPRAIISKP CSSTSFSCSS SFICFSRPQS TPLAMVCLRR 180
SPRARGARRA SPESAPGPCT PLHRDKHEAL SLQTRRGALQ DPESTKSRSP VPSLRPRWSS 240 
VPAPRSGTAR APRGRAPPQP GRTAAPGCGR RRWDRPEGRA RPGAGASSPG PSAARRPERT 300 
PRRLRRRRRL IPGPGRGARG VPGGPPSALQ EGGAQVHAAA PPVVRMSNRV RRL

Seq ID NO: 47 DNA sequence

Nucleic Acid Accession #: NM_020957.1
Coding sequence: 1156..3486

1 11 21 31 41 51 

| | | | | |

CAAAGCTCTA AGTATGCTGG GACAGATACT ACAAATGAAC TTTATGATGA GCGAATTAAC 60 
CTGATTTATA GTCCTGTACT TTCTCTACGT GCCATATCCA TTATTAAAGA AATGAGTCTA 120 
AGTAGGAAGT AGAGTTAACC TATAGTTTCA TTTCTTGAAT TTCTTATTCT CTTTCTTCAG 180 




TCTTTTTCAG TTAACCTACA CACACACACA CACACACACA CACACACACA CACATATGTT 240 

TATAAGTGGG ATGGGAGAAC GGGTAOGGTG ATAATTAAAA GAGGTAAGGT TTCTCTTGAG 300 

ATGAAAATGT TCTAAAATTG TGATGGCGGA TGCACACCTC TGAATATATT AAAAGCCATT 360 

GAAATGAAAA AAGGGTGGGG GGAATCCAAA AGTGTAGCAG ACCCAACCTT GAGATTTGCT 420 

TGTTTGGGAA TGAATTTTCC AATAACTTGA AAGTTGTAAA AACTCACACT TCTCAGGGTT 480 

AGGTGTCAGA AAGAAAAGGA AGTAATTTAT TCTTTAATAA AGCAATTGTT AAATACTCTT 540 

TAGAACTACC ACTGATTGCA ATTTTGCAGT GTCTACTCAT AGTGTCTATA TAGGTACCAT 600 

GAAAAAGATG TACTTGTGAA ACTGTTCTCA TGTTACTTCA GAAAAATTTT GCTTCTAAGT 660 

GTGTATTCTA TGTCTGGTTA AATGTTCATT GAATTTTATT TAATCATTAA TCTCAACAGC 720 

ATTAAACAGT CAATAACATA AATGACAGTC TTCTCTTTGT ACTCCTCCCT GTACAACATC 780 

ACAGAGCTCC ATCTGTATAC ACGAAAGTCA CATGAAAATA GAACTCAGTG TTTTGTATTA 840 

CATAGTCTAT TCAGTACATT TAGAAGTATT TTGCCTCCAA TATTCAACCA CAGTAAAAGA 900 

CTCAGTGAGA ACGCGTGGTG GCGCTGCAGG TTAAGATGAC GGAAAATACA ACTGCCTACG 960 

CAGCTCCAGG ATCCAGCAAA CCGTTTCCCA AAGCCTGGAA GCAAAAGAAT AGCTGAGCCA 1020 

GAGCGAACGT GAGTGTGAAA CCTCTTTAAG ACACCGTTGG GCTGCTTGGT TCTGACATTC 1080 

TGGACTGCAA AACAGTTCTA CTAGGATCCT GGGGATACAT GAAGCTTCTG TGAACCAACT 1140 

TTTCAAGAAA AAGCAATGGA GATTGGATGG ATGCACAATC GGAGACAAAG GCAAGTCCTT 1200 

GTTTTCTTTG TTTTGCTGAG CTTGTCTGGG GCGGGCGCCG AGTTGGGGTC CTATTCCGTA 1260 

GTGGAAGAAA CGGAGAGAGG CTCTTTTGTG GCAAATCTAG GAAAAGACCT GGGGTTGGGG 1320 

TTGACAGAGA TGTCCACCCG CAAGGCCAGG ATCATTTCCC AGGGGAACAA ACAGCATTTG 1380 

CAGCTCAAGG CTCAAACTGG GGATTTGCTC ATAAATGAGA AGCTAGATCG AGAGGAGCTA 1440 

TGCGGTCCCA CTGAGCCTTG CATACTACAT TTCCAAGTGT TAATGGAAAA CCCTTTAGAA 1500 

ATATTTCAGG CTGAACTGAG GGTGATAGAT ATAAATGACC ATTCTCCCAT GTTCACTGAA 1560 

AAGGAAATGA TTCTAAAAAT ACCGGAAAAC AGTCCTCTAG GAACTGAGTT CCCTCTGAAT 1620 

CATGCTTTGG ACTTGGACGT AGGAAGCAAT AATGTTCAAA ACTATAAAAT CAGCCCAAGC 1680

TCTCATTTCC GGGTTCTAAT CCATGAATTC AGAGATGGCA GGAAATACCC TGAGCTAGTG 1740 

TTGGATAAAG AGCTGGATCG GGAGGAGGAG CCTCAACTAA GATTAACCCT GACAGCGCTG 1800 

GATGGTGGCT CTCCACCGCG ATCTGGAACT GCTCAGGTCC GTATTGAAGT GGTGGACATC 1860 

AATGATAACG CTCCTGAGTT TGAGCAGCCC ATCTACAAAG TGCAGATTCC AGAGAACAGT 1920 

CCTCTTGGCT CCCTGGTTGC CACCGTCTCC GCCAGGGATT TAGACGGCGG AGCCAATGGA 1980 

AAAATATCAT ACACACTCTT TCAGCCTTCG GAGGATATTA GTAAAACTTT GGAGGTAAAT 2040 

CCTATGACAG GGGAAGTTCG ACTGAGAAAG CAAGTAGATT TCGAAATGGT TACGTCTTAT 2100 

GAAGTGCGCA TCAAAGCCAC AGATGGGGGA GGTCTTTCAG GAAAGTGCAC TCTTCTCCTG 2160 

CAGGTGGTGG ACGTGAATGA CAATCCCCCA CAGGTGACCA TGTCTGCACT CACCAGCCCC 2220 

ATCCCAGAGA ACTCGCCTGA GATAGTAGTT GCTGTTTTCA GCGTTTCAGA TCCTGACTCC 2280

GGAAACAATG GGAAGACGAT TTCCTCCATC CAGGAAGACC TTCCCTTTCT TCTAAAACCT 2340 

TCAGTCAAGA ACTTTTACAC CTTGGTAACG GAGAGAGCAC TCGACAGAGA AGCAAGAGCT 2400

GAATATAATA TCACCCTCAC CGTCACAGAT ATGGGGACTC CAAGGCTGAA AACGGAGCAC 2460 

AACATAACAG TGCAGATATC AGATGTCAAT GATAACGCCC CCACTTTCAC CCAAACCTCC 2520 

TACACCCTGT TCGTCCGCGA GAACAACAGC CCCGCCCTGC ACATCGGCAG CGTCAGCGCC 2580 

ACAGACAGAG ACTCAGGCAC CAACGCCCAG GTCACCTACT CGCTGCTGCC GCCCCAGGAC 2640 

CCGCACCTGC CCCTCGCCTC CCTGGTCTCC ATCAACGCAG ACAACGGCCA CCTGTTCGCC 2700 

CTCAGGTCGC TGGACTACGA GGCCCTGCGG GAGTTCGAGT TCCGCGTGAG CGCCACAGAC 2760 

CGCGGCTCCC CGGCTTTGAG CAGCGAGGCG CTGGTGCGCG TGCTGGTGCT GGACGCCAAC 2820 

GACAACTCGC CCTTCGTGCT GTACCCGCTG CAGAACGGCT CCGCGCCCTG CACTGAGCTG 2880 

GTGCCCCGGG CGGCCGAGCC GGGCTACCTG GTGACCAAGG TGGTGGCGGT GGACGGCGAC 2940 

TCGGGCCAGA ATGCCTGGCT GTCGTACCAG CTGCTCAAGG CCACGGAGCC CGGGCTGTTC 3000 

GGTGTGTGGG CGCACAATGG CGAGGTGCGC ACCGCCAGGC TGCTGAGCGA GCGCGACGCA 3060 

GCCAAGCAGA GGCTGGTGGT GCTGGTCAAG GACAATGGCG AGCCTCCGCG CTCGGCCACC 3120 

GCCACGCTGC ACGTGCTCCT GGTGGACGGC TTCTCCCAGC CCTTCCTGCC GCTCCCAGAG 3180 

GCGGCCCCCG GCCAGACCCA GGCCAACTCG CTCACTGTCT ACCTGGTGGT GGCGTTGGCC 3240 

TCGGTGTCGT CGCTCTTCCT CTTTTCGGTG CTCCTGTTCG TGGCGGTGCG GCTGTGCAGG 3300 

AGGAGCAGGG CGGCCTCGGT GGGCCGCTGC TCGATGCCTG AGGGCCCCTT TCCAGGGCGT 3360 

CTGGTGGACG TAAGCGGCAC CGGGACCCTG TCCCAGAGCT ACCAATACGA GGTGTGTCTG 3420 

ACAGGAGGCT CAGAAACAAG TGAGTTCAAG TTCCTGAAGC CGATTATCCC CAACTTCTCT 3480 

CCTTAGGGCA CTAGGAAAGA AATAGATTAA AATTCCACCC TTCACAATAG CTTTGGATTT 3540 

AATTATTGAT AGGAACCCAT TTGATAAATT CCTTAACTTC TTATGATTGT CTTGTTGATT 3600 

AAATTGTTCA TGCTCACCAC CACCAATAAG GTATTTTTCT CTGATTGTTA GTTCAAATTA 3660 

TATTGTTAAT TCCAGTTTCC CTTTTCCTCA TATTTACCCC GAAGAGGTGT TGCATATAGA 3720 

ATCCCAATTA ACAAAATATA CTTTATCTTC AAAGTTGATG TCATTTAAAA TTTTTCCGTC 3780 

TTTATATTTT ATTTACTTCC TATTCATTTT TTGCTCCATT TTTCATGTTA CTTCTCAGTT 3840 

TCCTAGAACT TCAAGTATTA AAATAACCTG TTGCATGTAT TAGGCATATT TCCTATGTTA 3900 

CATTTCTTTT GTCTATTTTC CTTTCAAAAT TGGTATTTTT GTTGGGCTCA ATTTTCATTA 3960 

TAATACTTTT CTTAAAGTTT CTTTCTTTCT TTTCTTTTCT TTCTTTTTTT TTTTTTCCTT 4020 

TTTGAGACAG GGTCTTACTC TTGTCACCCA GGCTGGAGTG CAGTGGCACA ATCTTGGCTC 4080

ACTGCAACCT CTGCCTCCTG GGCTCAACGG ATCCTTCCAC CTCAGCCTCC CAAGTAGCTT 4140 

GGACTATAGG TGCATGCCAC CATGCCTGGC TAATCTTTTG CAGCGATGAG ATTTTGCCAA 4200 

GTTGCCCAGG CTGATCTTGA ACTCCTGGGC TCAAGCCATC CTCCCTCCTC AGCCTCCCAA 4260 

AATTCTGGGA TTACAGGCAT AAGCCAATGT GCCCATCCAA AGTTTTATTT ATTTATTTTT 4320 

TTGAGATGGA GTCTCGTAAA GTTACCTTTA AAAAAAAAGT TCTATTTTCC CTGTATTGGT 4380 

ATCTCCTTAA ATAAAATAAA ATATTCCTAT TGTAAGTGAT ATGAGAAATC TTTAACCAGC 4440 

CTTATCTAAA AATAAAAAGA GAAGCCATTG TAAGACATTC AGTATGTGTA AATGTGTTTG 4500 

TGTTTGTAGA CAAAAGGCAA AGGTATTATG TAAAAATATT TAATAATTTA TTCTTTCTAT 4560 

TACTGAATTA AAAAATCAGA GGTCCCTGTT ATATTTTTAA TGGCTAACAA CTCAATCTCA 4620 

TTAAGTTGGA AAAAAAACTT ATCAAAGAGA CATTTACATG GTTTGGCTTT TATATTCATC 4680 

ATAGTATACA TTGGCGGTAT CTAGCCCTTT CTCTGTAAAA TATCCCTATG TTTAATCTGT 4740 

ATTTCTTGCT TATTATATGT AAAGTTGAGC TTCTTTCTAG ATATTAGGCC TTTGAATAAA 4800 
ATTCTATGTG AGTCAGAAAA AAAAAAA 

Seq ID NO: 48 Protein sequence 
Protein Accession #: NP_066008.1 

1 11 21 31 41 51 






I I I I I I 

MEIGWMHNRR QRQVLVFFVL LSLSGAGAEL GSYSVVEETE RGSFVANLGK DLGLGLTEMS 60
TRKARIISQG NKQHLQLKAQ TGDLLINEKL DREELCGPTE PCILHFQVLM ENPLEIFQAE 120
LRVTDINDHS PMFTEKEMIL KIPENSPLGT EFPLNHALDL DVGSNNVQNY KISPSSHFRV 180
LIHEFRDGRK YPELVLDKEL DREEEPQLRL TLTALDGGSP PRSGTAQVRI EVVDINDNAP 240
EFEQPIYKVQ IPENSPLGSL VATVSARDLD GGANGKISYT LFQPSEDISK TLEVNPMTGE 300
VRLRKQVDFE MVTSYEVRIK ATDGGGLSGK CTLLLQVVDV NDNPPQVTMS ALTSPIPENS 360
PEIVVAVFSV SDPDSGNNGK TISSIQEDLP FLLKPSVKNF YTLVTERALD REARAEYNIT 420
LTVTDMGTPR LKTEHNITVQ ISDVNDNAPT FTQTSYTLFV RENNSPALHI GSVSATDRDS 480
GTNAQVTYSL LPPQDPHLPL ASLVSINADN GKLFALRSLD YEALREFEFR VSATDRGSPA 540
LSSEALVRVL VLDANDNSPF VLYPLQNGSA PCTKLVPRAA EPGYLVTKVV AVDGDSGQNA 600
WLSYQLLKAT EPGLFGVWAH NGEVRTARLL SERDAAKQRL VVLVKDNGEP PRSATATLHV 660
LLVDGFSQPF LPLPEAAPGQ TQANSLTVYL VVALASVSSL FLFSVLLFVA VRLCRRSRAA 720
SVGRCSMPEG PFPGRLVDVS GTGTLSQSYQ YEVCLTGGSE TSEFKFLKPI IPNFSP 

Seq ID NO: 49 DNA sequence

Nucleic Acid Accession #: CAT cluster 

1 11 21 31 41 51 

I I I I I I 

TTTTTTTTTG ATAATACACA GACTTTAATT AAAATTGTAC TAAAATTAAA TGTCTAAATA 60 

AATTAGAATG GTACATGGTA CATCTAAATG TATGTTTATA TATTTTATTT GTGCATTTTA 120 

TTCCTAGGGT TGCTTTTGCT TTAGTTTGTA AAACGTTCTT ATTTTTATGA TAATGTAGTA 180 

TATACTAAAT AAAGAAAAAT CAGGAAATAG AAAATGAAGA AGAAAACATT AGCTATTGTC 240 

AACCAAATAA AAATTGTGCA ATCTCTAAGC ACATGAACTA TGTATTATTT GTACAGCATG 300 

TACAATGTTT ATGCTTCACA GGGTGAGGTA GAGACTGCAA AACATTGAAC CTGGGACAAA 360 

TAAGAAAGTA AGGAAATTTT CACAACATAT TAATATTATA GAAAATGTTG AACTTAACAG 420 
TTAAGATACA AGTAGTGAAA AATGATAGTA TTTAAGGAGA TCTAGAAAAT TTA 

Seq ID NO: 50 DNA sequence 

Nucleic Acid Accession #: AF034799.1 

Coding sequence: 170..3943

1 11 21 31 41 51 

I I I I I I

GATTCCGGGA GGCAAGTGAG GAGAGAAGAT GCTGTAGCGT CCTCACCGGC TGCCAGCAGG 60 

GAAATGGTCC AGGAGTGCTG GGTGTGAGCC TCCCTTCTCC TCAAGCCGGA GACTGCGGTT 120 

GTCATTGATC AATTGAAGAA GCAAGGACCC GAAATCACAG ACATTAGCAA TGATGTGTGA 180 

AGTGATGCCC ACGATTAATG AGGACACCCC AATGAGCCAA AGGGGGTCCC AAAGCAGTGG 240 

CTCGGACTCA GACTCCCATT TTGAGCAGCT GATGGTGAAT ATGCTAGATG AAAGGGATCG 300

TCTTCTAGAC ACCCTTCGGG AGACCCAGGA AAGCCTCTCA CTTGCCCAGC AAAGACTTCA 360 

GGATGTCATC TATGACCGAG ACTCACTCCA GAGACAGCTC AATTCAGCCC TGCCACAGGA 420 

TATCGAATCC CTAACAGGAG GGCTGGCTGG TTCTAAGGGG GCTGATCCAC CGGAATTTGC 480 

TGCACTGACA AAAGAATTAA ATGCCTGCAG GGAACAACTT CTAGAAAAGG AAGAAGAAAT 540 

CTCTGAACTT AAAGCTGAAA GAAACAACAC AAGACTATTA CTGGAGCATT TGGAGTGCCT 600 

TGTGTCACGA CATGAAAGAT CACTAAGAAT GACGGTGGTA AAACGGCAAG CCCAGTCTCC 660 

CTCAGGAGTA TCCAGTGAAG TTGAAGTTCT CAAGGCACTG AAATCTTTGT TTGAGCACCA 720 

CAAGGCCTTG GATGAAAAGG TAAGGGAGCG ACTGAGGGTT TCTTTAGAAA GAGTCTCTGC 780 

ACTGGAAGAA GAACTAGCTG CTGCTAATCA GGAGATTGTT GCCTTGCGTG AACAAAATGT 840 

TCATATACAA AGAAAAATGG CATCAAGCGA GGGATCCACA GAGTCAGAAC ATCTTGAAGG 900 

GATGGAACCT GGACAGAAAG TCCATGAGAA GCGTTTGTCC AATGGTTCTA TAGACTCAAC 960 

CGATGAAACT AGTCAAATAG TTGAACTACA AGAATTGCTT GAAAAGCAAA ACTATGAAAT 1020 

GGCCCAGATG AAAGAACGTT TAGCAGCCCT TTCTTCCCGA GTGGGAGAGG TGGAACAGGA 1080 

AGCAGAGACA GCAAGAAAGG ATCTCATTAA AACAGAAGAA ATGAACACCA AGTATCAAAG 1140 

GGACATTAGG GAGGCCATGG CACAAAAGGA AGATATGGAA GAAAGAATTA CAACCCTTGA 1200 

AAAGCGTTAC CTCAGTGCTC AGAGAGAATC TACCTCCATA CATGACATGA ATGATAAACT 1260 

AGAAAATGAG TTAGCAAATA AAGAAGCTAT CCTACGGCAG ATGGAAGAGA AAAACAGACA 1320 

GTTACAAGAA CGTCTTGAGC TAGCTGAAGA AAAGTTGCAG CAGACCATGA GAAAGGCTGA 1380 

AACCTTGCCT GAAGTAGAGG CTGAACTGGC TCAGAGAATT GCAGCCCTAA CCAAGGCTGA 1440 

AGAGACACAT GGAAATATTG AAGAACGTAT GAGACATTTA GAGGGTCAAC TTGAAGAGAA 1500 

GAATCAAGAA CTTCAAAGAG CTAGGCAAAG AGAGAAAATG AATGAGGAGC ATAACAAGAG 1560 

ATTATCGGAT ACGGTTGATA GACTTCTGAC TGAATCCAAT GAACGCCTAC AACTACACTT 1620

AAAGGAAAGA ATGGCTGCTC TAGAAGAAAA GAATGTTTTA ATTCAAGAAT CAGAAACTTT 1680 

CAGAAAGAAT CTTGAAGAAT CTTTACATGA TAAGGAAAGC TTAGCAGAAG AAATTGAAAA 1740 

GCTGAGATCT GAACTTGACC AATTGAAAAT GAGAACTGGC TCTTTAATTG AACCCACAAT 1800 

ACCAAGAACT CATCTAGACA CCTCAGCTGA GTTGCGGTAC TCAGTGGGAT CCCTAGTGGA 1860 

CAGCCAGTCT GATTACAGAA CAACTAAAGT AATAAGAAGA CCAAGGAGAG GCCGCATGGG 1920 

TGTGCGAAGA GATGAGCCAA AGGTGAAATC TCTTGGGGAT CACGAGTGGA ATAGAACTCA 1980 

ACAGATTGGA GTACTAAGCA GCCACCCTTT TGAAAGTGAC ACTGAAATGT CTGATATTGA 2040 

TGATGATGAC AGAGAAACAA TTTTTAGCTC AATGGATCTT CTCTCTCCAA GTGGTCATTC 2100 

CGATGCCCAG ACGCTAGCCA TGATGCTTCA GGAACAATTG GATGCCATCA ACAAAGAAAT 2160

CAGGCTAATT CAGGAAGAAA AAGAATCTAC AGAGTTGCGT GCTGAAGAAA TTGAAAATAG 2220 

AGTGGCTAGT GTGAGCCTCG AAGGCCTGAA TTTGGCAATG GTCCACCCAG GTACCTCCAT 2280 

TACTGCCTCT GTTACAGCTT CATCGCTGGC CAGTTCATCT CCCCCCAGTG GACACTCAAC 2340 

TCCAAAGCTC ACCCCTCGAA GCCCTGCCAG GGAAATGGAT CGGATGGGAG TCATGACACT 2400 

GCCAAGTGAT CTGAGGAAAC ATCGGAGAAA GATTGCAGTT GTGGAAGAAG ATGGTCGAGA 2460 

GGACAAAGCA ACAATTAAAT GTGAAACTTC TCCTCCTCCT ACCCCTAGAG CCCTCAGAAT 2520 

GACTCACACT CTCCCTTCTT CCTACCACAA TGATGCTCGA AGTAGTTTAT CTGTCTCTCT 2580 

TGAGCCAGAA AGCCTCGGGC TTGGTAGTGC CAACAGCAGC CAAGACTCTC TTCACAAAGC 2640 

CCCCAAGAAG AAAGGAATCA AGTCTTCAAT AGGACGTTTG TTTGGTAAAA AAGAAAAAGC 2700

TCGACTTGGG CAGCTCCGAG GCTTTATGGA GACTGAAGCT GCAGCTCAGG AGTCCCTGGG 2760 

GTTAGGCAAA CTCGGAACTC AAGCTGAGAA GGATCGAAGA CTAAAGAAAA AGCATGAACT 2820 

TCTTGAAGAA GCTCGGAGAA AGGGATTACC TTTTGCCCAG TGGGATGGGC CAACTGTGGT 2880 
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10 
15 
20 
25 
30 
35 
40 
45 



CGCATGGCTA 
CGTGAAGAGT 
AATCAGCAAT 
AACAAGTCCT 
TGAAGAAATG 
GGCCCAGTGT 
TGGAAATGAA 
CTTGGTAGAT 
AATGGTGGAT 
GAATTATGAC 
CGTGTTGGTG 
ATATGCAAAT 
CTTTGACTAC 
GCAGATTCTT 
TGAAAGTGAT 
TGAAGTACAT 
GTTAACCACA 
GCAGAGGTTA 
GCAGCACTGA 
AAGAAGACGA 



GAGCTTTGGT 
GGTGCCATCA 
CCACTGCATC 
TCAGCTCCTC 
GAAAATCTTG 
CCGGTTTTTC 
TGGCTTCCCA 
GCAAGAATGT 
AGTTTCCATC 
AGAAAAGAAC 
TGGAGCAATG 
AATATACTTG 
AGCAGCTTAG 
GAAAGAGAAT 
GACAAGAACT 
GGAATCAGCA 
ACCTCTGGGC 
GACAACTCCA 
CCTGCTATGG 
GCAGTGAAAA 



TGGGAATGCC 
TGTCTGCTTT 
GCTTAAAACT 
CAACATCTCG 
CAGCTCCAGC 
TACAGACCCT 
GCTTGGGGTT 
TAGATCACCT 
GAACAAGTTT 
TAGAAAGAAG 
ACCGAGTTAT 
AGAGCGGTGT 
CTTTATTATT 
ACAATAACCT 
TCAGACGTGG 
TGATGCCTGG 
AGTCAAGAAA 
CTGTTCGCAC 
CGTCTTTTCA 
CCTTTGTGAA 



Seq ID NO: 51 Protein sequence 
Protein Accession #: AAC26100.1 



MMCEVMPTIN 
QRI*QDVIYDR 
EEEISELKAE 
FEHHKALDEK 
HLEGMEPGQK 
VEQEAETARK 
NDKLENELAN 
TKAEETHGNI 
QLHLKERMAA 
EPTIPRTHLD 
NRTQQIGVLS 
NKEIRLIQEE 
GHSTPKLTPR 
AliRMTHTIiPS 
KEKARLGQLR 
PTWAWLELW 
MVSLTSPSAP 
HEWIGNEWLP 
LKRLNYDRKE 
LDENFDYSSL 
FPPREVHGIS 



11 
I 

EDTPMSQRGS 
DSLQRQLNSA 
RNNTRLLLEH 
VRERLRVSLE 
VHEKRLSNGS 
DLIKTEEMNT 
KEAILRQMEE 
EERMRHLEGQ 
LEEKNVLIQE 
TSAELRYSVG 
SHPFESDTEM 
KESTELRAEE 
SPAREMDRMG 
SYHNDARSSL 
GFMETEAAAQ 
LGMPAWYVAA 
PTSRTPSGNV 
SLGLPQYRSY 
UERRREASQH 
ALLLQIPTQN 
MMPGSSETLP 



21 
I 

QSSGSDSDSH 
IiPQDIESLTG 
IiECLVSRHER 
RVSALEEELA 
IDSTDETSQI 
KYQRDIREAM 
KNRQLQERLE 
LEEKNQELQR 
SETFRKNLEE 
SIrVDSQSDYR 
SDIDDDDRET 
IENRVASVSL 
VMTLPSDLRK 
SVSLEPESLG 
ESLGLGKLGT 
CRANVKSGAI 
WVTHEEMENL 
FMECLVDARM 
EIKDVLVWSN 
TQARQ I LERE 
AGFRLTTTSG 



TGCGTGGTAC 
ATCTGACACT 
TCGATTAGCA 
AACTCCTTCA 
AAAAACGAAA 
GGCTTATGGA 
ACCTCAGTAC 
AACAAAAAAA 
ACAATATGGA 
ACGGGAAGCA 
TCGCTGGATA 
GCATGGCTCA 
ACAGATTCCA 
CTTGGCCCTG 
ATCAACCTGG 
GTCCTCAGAA 
AATGACAACA 
ATACTCATGT 
GTCTACTCTA 
AACTGAATTC 



31 
I 

FEQLMVNMLD 
GLAGSKGADP 
SI*RMTWKRQ 
AANQEIVAIiR 
VBLQELLEKQ 
AQKEDMEERI 
LAEEKLQQTM 
ARQREKMNEE 
SliHDKESLAE 
TTKVIRRPRR 
IFSSMDLLSP 
EGLNLAMVHP 
HRRKIAWEE 
LGSANSSQDS 
QAEKDRRLKK 
MSALSDTEIQ 
AAPAKTKESE 
LDHLTKKDLR 
DRVIRWIOAI 
YNNLLALGTE 
QSRKMTTDVA 



GTGGCAGCCT 
GAGATCCAGA 
ATCCAGGAGA 
GGCAACGTTT 
GAATCTGAGG 
GATATGAATC 
AGAAGTTACT 
GATCTCCGTG 
ATTATGTGCT 
AGCCAACATG 
CAAGCAATTG 
CTTATAGCCC 
ACACAGAACA 
GGAACTGAAA 
AGAAGGCAGT 
ACATTACCAG 
GATGTTGCTT 
TGACCAGCCA 
CCTAAAGTGC 



41 
I 

ERDRLLDTLR 
PEFAALTKEL 
AQSPSGVSSE 
EQNVHIQRKM 
NYEMAQMKER 
TTLiEKRYLSA 
RKAETLPEVE 
HNKRLSDTVD 
EIEKLRSELD 
GRMGVRRDEP 
SGHSDAQTIiA 
GTSITASVTA 
DGREDKATIK 
LHKAPKKKGI 
KHEIiLEEARR 
REIG1 SNPliH 
EGSWAQCPVF 
VHLKMVDSFH 
GLREYANNIL 
RRLDESDDKN 
SSRLQRLDNS 



GCCGAGCCAA 
GAGAAATTGG 



GGGTGACTCA 
AAGGAAGCTG 
ATGAGTGGAT 
TTATGGAATG 
TCCATTTAAA 
TAAAGAGGTT 
AAATAAAAGA 
GACTTCGAGA 
TGGATGAAAA 
CCCAGGCAAG 
GGCGACTGGA 
TTCCTCCTCG 
CTGGATTTAG 
CATCAAGACT 
CTCAAAGGAG 
ACTACCATCT 



SI 
I 

ETQESLSLAQ 
NACREQLLEK 
VEVLKALKSL 



LAAXiSSRVGE 
QRESTSIHDM 
AELAQRIAAL 
RIiLTE SNERL 
QLKMRTGSI*! 
KVKSLGDHEW 
MMLQEQLDAI 
SSLASSSPPS 
CETSPPPTPR 
KSSIGRLFGK 
KGLPFAQWDG 
RXjKLRLAXQE 
LQTIiAYGDMN 
RTSLQYGIMC 
ESGVHGSLXA 
FRRGSTWRRQ 
TVRTYSC 



WHAT IS CLAIMED IS:

1. A method of detecting an androgen-independent prostate cancer cell in a sample from
a patient having undergone androgen ablation therapy, the method comprising determining
the presence or absence of a nucleic acid comprising a sequence at least 80% identical to a
sequence as shown in Tables 1A-4.

2. The method of claim 1, wherein said determining is by hybridizing with a
polynucleotide that selectively hybridizes to a sequence at least 95% identical to a sequence
as shown in Tables 1A-4.



3. The method of claim 1, wherein the biological sample:

a) is a tissue sample; or 

b) comprises isolated nucleic acids. 



4. The method of claim 3:

a) wherein the nucleic acids are mRNA; or

b) further comprising the step of amplifying nucleic acids before the step of 
contacting the biological sample with the polynucleotide. 



5. The method of claim 2, wherein the polynucleotide:

a) comprises a sequence as shown in Tables 1A-4;

b) is labeled, including a fluorescent label; or 

c) is immobilized on a solid surface. 

6. The method according to claim 1, wherein said biological sample is contacted with a
plurality of polynucleotides that each selectively hybridizes to a sequence at least 95%
identical to a first sequence as shown in Tables 1A-4.

7. The method according to claim 6, wherein said plurality of polynucleotides are
immobilized on a solid surface.

8. An isolated polypeptide which is encoded by a nucleic acid molecule having a
polynucleotide sequence as shown in Tables 1A-4.

9. An antibody that specifically binds a polypeptide of claim 8. 

10. The antibody of claim 9:

a) further conjugated to an effector component, including a fluorescent label, a
radioisotope or a cytotoxic chemical; or

b) which is an antibody fragment or humanized antibody. 

11. A method of detecting an androgen-independent prostate cancer cell in a patient
having undergone androgen ablation therapy, the method comprising contacting a sample
from said patient with an antibody of claim 9.

12. The method of claim 11, wherein:

a) the antibody is further conjugated to an effector component, e.g., a fluorescent
label; or

b) said sample comprises a cell. 

13. A method of detecting antibodies specific to androgen-independent prostate cancer in
a patient having undergone androgen ablation, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid comprising a sequence
from Tables 1A-4.

14. A method of inhibiting proliferation of androgen-independent prostate cancer cells in
a patient having undergone androgen ablation therapy, the method comprising administering
to the patient a therapeutically effective amount of a compound that specifically eliminates
cells expressing an antigen listed in Tables 1A-4.

15. The method of claim 14, wherein the compound is an antibody.

16. A drug screening assay comprising the steps of:

a) administering a test compound to a mammal having a prostate proliferative
condition or a cell isolated therefrom;

b) comparing the level of gene expression of a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in
Tables 1A-4 in a treated cell or mammal with the level of gene expression of
the polynucleotide in a control cell or mammal, wherein a test compound that
modulates the level of expression of the polynucleotide is a candidate for the
treatment of prostate cancer.

17. The assay of claim 16, wherein: 

a) the control is a mammal with prostate cancer or a cell therefrom that has not been
treated with the test compound; or

b) the control is a normal cell or mammal.

18. A method for treating a mammal having a prostate proliferative condition or prostate 
cancer comprising administering a compound identified by the assay of claim 16. 

19. A pharmaceutical composition for treating a mammal having a prostate proliferative
condition or prostate cancer, the composition comprising a compound identified by the assay 
of claim 16 and a physiologically acceptable excipient. 

20. A method of detecting a prostate cancer associated transcript, the method comprising
contacting a biological sample from the patient with a plurality of polynucleotides wherein at
least two of said polynucleotides selectively hybridize to a different sequence at least 80%
identical to a sequence as shown in Tables 1A-4.

21. A method of detecting a prostate cancer, the method comprising the steps of:

a) providing a biological sample from a patient;

b) contacting the biological sample with a first polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a first sequence as shown in
Tables 1A-4, to determine the level of a prostate cancer-associated transcript
in the biological sample; and with a second polynucleotide that selectively
hybridizes to a second sequence at least 80% identical to a sequence not
shown in Tables 1A-4; wherein the expression of said second sequence is not
substantially changed in prostate cancer, to determine the level of expression
of a control transcript in the biological sample; and

c) comparing the level of the prostate cancer-associated transcript to a level of the
normal tissue associated transcript in the biological sample.

22. A method for quantitation of a prostate cancer-associated transcript in a cell from a
patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1A-4.

23. The method of claim 22, wherein: 

a) the polynucleotide selectively hybridizes to a sequence at least 95% identical to a
sequence as shown in Tables 1A-4;

b) the biological sample is a tissue sample;

c) the biological sample comprises isolated nucleic acids;

d) the nucleic acids are mRNA;

e) further comprising the step of amplifying nucleic acids before the step of
contacting the biological sample with the polynucleotide;

f) the polynucleotide comprises a sequence as shown in Tables 1A-4;

g) the polynucleotide is labeled, including a fluorescent label; or 

h) the polynucleotide is immobilized on a solid surface. 

24. A biochip comprising a plurality of polynucleotides that selectively hybridize to a
sequence at least 80% identical to a sequence as shown in Tables 1A-4.

25. A method of screening drug candidates comprising:

a) providing a cell that expresses an expression profile gene selected from the group
consisting of an expression profile gene set forth in Tables 1A-4 or fragment
thereof;

b) adding a drug candidate to said cell; and

c) determining the effect of said drug candidate on the expression of said expression
profile gene.

26. A method according to claim 22 wherein said determining comprises comparing the
level of expression in the absence of said drug candidate to the level of expression in the
presence of said drug candidate.
covers only those claims for which fees were paid, specifically claims Nos.: 1-26 with respect to U83115 of Table 1A

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION IS LACKING

Group I, claim(s) 1-7 and 20-24, drawn to methods of detecting a prostate cancer cell or associated transcript. 

Group II, claim(s) 8 and 13, drawn to a polypeptide and method of use thereof. 

Group III, claim(s) 9-12, drawn to an antibody and method of use thereof.

Group IV, claim(s) 14 and 15, drawn to methods of administering a compound which eliminates cells expressing an antigen. 

Group V, claim(s) 16-17 and 25-26, drawn to screening methods for a test compound which modulates expression of particular genes. 

Group VI, claim(s) 18-19, drawn to a pharmaceutical compound and method of use thereof in the treatment of a mammal. 

This application contains claims directed to more than one species of the generic invention. These species are deemed to lack unity of 
invention because they are not so linked as to form a single general inventive concept under PCT Rule 13.1. 

In order for more than one species to be examined, the appropriate additional examination fees must be paid. The species are as follows: 

each of the sequences as shown in Tables 1A-4 represents a single invention.

The claims are deemed to correspond to the species listed above in the following manner: N/A 

The following claim(s) are generic: all of claims 1-26. 

The inventions listed as Groups I-VI do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT Rule
13.2, they lack the same or corresponding special technical features for the following reasons: the claims of the various groups are not so
related as to support unity of invention. Each group has its own special technical feature which is not shared with any other group as
follows: I - determining the presence or absence of a nucleic acid; II - polypeptide and method of use thereof; III - antibody and method
of use thereof; IV - use of a compound to eliminate cells expressing a specific antigen; V - methods of drug screening based on
modulation of gene expression; and VI - pharmaceutical compositions and methods of use thereof in treating a prostate proliferative
condition or prostate cancer. Thus, the claims lack the same or corresponding special technical features.

The species listed above do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT Rule 13.2, the
species lack the same or corresponding special technical features for the following reasons: the plethora of sequences as shown in Tables 
1A-4 have no relation whatsoever with one another and thus clearly do not share any special technical features. As each sequence is 
different from and independent of the others, the particular sequence and biological function of each such sequence constitutes its own 
special technical features. 